PREFACE


The study of real analysis is indispensable for a prospective graduate student of pure or 
applied mathematics. It also has great value for any student who wishes to go beyond the 
routine manipulations of formulas because it develops the ability to think deductively, 
analyze mathematical situations and extend ideas to new contexts. Mathematics has 
become valuable in many areas, including economics and management science as well 
as the physical sciences, engineering, and computer science. This book was written to 
provide an accessible, reasonably paced treatment of the basic concepts and techniques of 
real analysis for students in these areas. While students will find this book challenging, 
experience has demonstrated that serious students are fully capable of mastering the 
material. 

The first three editions were very well received and this edition maintains the same 
spirit and user-friendly approach as earlier editions. Every section has been examined. 
Some sections have been revised, new examples and exercises have been added, and a new 
section on the Darboux approach to the integral has been added to Chapter 7. There is more 
material than can be covered in a semester and instructors will need to make selections and 
perhaps use certain topics as honors or extra credit projects. 

To provide some help for students in analyzing proofs of theorems, there is an 
appendix on “Logic and Proofs” that discusses topics such as implications, negations, 
contrapositives, and different types of proofs. However, it is a more useful experience to 
learn how to construct proofs by first watching and then doing than by reading about 
techniques of proof. 

Results and proofs are given at a medium level of generality. For instance, continuous 
functions on closed, bounded intervals are studied in detail, but the proofs can be readily 
adapted to a more general situation. This approach is used to advantage in Chapter 11 
where topological concepts are discussed. There are a large number of examples to 
illustrate the concepts, and extensive lists of exercises to challenge students and to aid them 
in understanding the significance of the theorems. 

Chapter 1 has a brief summary of the notions and notations for sets and functions that 
will be used. A discussion of Mathematical Induction is given, since inductive proofs arise 
frequently. There is also a section on finite, countable and infinite sets. This chapter can be
used to provide some practice in proofs, covered quickly, or used as background material
to return to later as necessary.

Chapter 2 presents the properties of the real number system. The first two sections deal 
with Algebraic and Order properties, and the crucial Completeness Property is given in 
Section 2.3 as the Supremum Property. Its ramifications are discussed throughout the 
remainder of the chapter. 

In Chapter 3, a thorough treatment of sequences is given, along with the associated 
limit concepts. The material is of the greatest importance. Students find it rather natural 
though it takes time for them to become accustomed to the use of epsilon. A brief 
introduction to Infinite Series is given in Section 3.7, with more advanced material 
presented in Chapter 9.

Chapter 4 on limits of functions and Chapter 5 on continuous functions constitute the
heart of the book. The discussion of limits and continuity relies heavily on the use of 
sequences, and the closely parallel approach of these chapters reinforces the understanding 
of these essential topics. The fundamental properties of continuous functions on intervals 
are discussed in Sections 5.3 and 5.4. The notion of a gauge is introduced in Section 5.5 and 
used to give alternate proofs of these theorems. Monotone functions are discussed in 
Section 5.6. 

The basic theory of the derivative is given in the first part of Chapter 6. This material is 
standard, except that a result of Carathéodory is used to give simpler proofs of the Chain Rule
and the Inversion Theorem. The remainder of the chapter consists of applications of the 
Mean Value Theorem and may be explored as time permits. 

In Chapter 7, the Riemann integral is defined in Section 7.1 as a limit of Riemann 
sums. This has the advantage that it is consistent with the students’ first exposure to the 
integral in calculus, and since it is not dependent on order properties, it permits immediate 
generalization to complex- and vector-valued functions that students may encounter in later
courses. It is also consistent with the generalized Riemann integral that is discussed in 
Chapter 10. Sections 7.2 and 7.3 develop properties of the integral and establish the 
Fundamental Theorem of Calculus. The new Section 7.4, added in response to requests 
from a number of instructors, develops the Darboux approach to the integral in terms of 
upper and lower integrals, and the connection between the two definitions of the integral is 
established. Section 7.5 gives a brief discussion of numerical methods of calculating the 
integral of continuous functions. 

Sequences of functions and uniform convergence are discussed in the first two sections 
of Chapter 8, and the basic transcendental functions are put on a firm foundation in 
Sections 8.3 and 8.4. Chapter 9 completes the discussion of infinite series that was begun 
in Section 3.7. Chapters 8 and 9 are intrinsically important, and they also show how the 
material in the earlier chapters can be applied. 

Chapter 10 is a presentation of the generalized Riemann integral (sometimes called the 
“Henstock-Kurzweil” or the “gauge” integral). It will be new to many readers and they 
will be amazed that such an apparently minor modification of the definition of the Riemann 
integral can lead to an integral that is more general than the Lebesgue integral. This 
relatively new approach to integration theory is both accessible and exciting to anyone who 
has studied the basic Riemann integral. 

Chapter 11 deals with topological concepts. Earlier theorems and proofs are extended 
to a more abstract setting. For example, the concept of compactness is given proper 
emphasis and metric spaces are introduced. This chapter will be useful to students 
continuing on to graduate courses in mathematics. 

There are lengthy lists of exercises, some easy and some challenging, and “hints” to 
many of them are provided to help students get started or to check their answers. More 
complete solutions of almost every exercise are given in a separate Instructor’s Manual, 
which is available to teachers upon request to the publisher. 

It is a satisfying experience to see how the mathematical maturity of the students 
increases as they gradually learn to work comfortably with concepts that initially seemed 
so mysterious. But there is no doubt that a lot of hard work is required on the part of both the 
students and the teachers. 

Brief biographical sketches of some famous mathematicians are included to enrich the 
historical perspective of the book. Thanks go to Dr. Patrick Muldowney for his photograph 
of Professors Henstock and Kurzweil, and to John Wiley & Sons for obtaining portraits of 
the other mathematicians.

Many helpful comments have been received from colleagues who have taught from
earlier editions of this book and their remarks and suggestions have been appreciated. I 
wish to thank them and express the hope that they find this new edition even more helpful 
than the earlier ones. 


November 20, 2010 


Urbana, Illinois Donald R. Sherbert


CHAPTER 1


PRELIMINARIES 


In this initial chapter we will present the background needed for the study of real 
analysis. Section 1.1 consists of a brief survey of set operations and functions, two vital 
tools for all of mathematics. In it we establish the notation and state the basic 
definitions and properties that will be used throughout the book. We will regard the 
word “set” as synonymous with the words “class,” “collection,” and “family,” and
we will not define these terms or give a list of axioms for set theory. This approach, 
often referred to as “naive” set theory, is quite adequate for working with sets in the 
context of real analysis. 

Section 1.2 is concerned with a special method of proof called Mathematical 
Induction. It is related to the fundamental properties of the natural number system and, 
though it is restricted to proving particular types of statements, it is important and used 
frequently. An informal discussion of the different types of proofs that are used in 
mathematics, such as contrapositives and proofs by contradiction, can be found in 
Appendix A. 

In Section 1.3 we apply some of the tools presented in the first two sections of this 
chapter to a discussion of what it means for a set to be finite or infinite. Careful definitions 
are given and some basic consequences of these definitions are derived. The important 
result that the set of rational numbers is countably infinite is established. 

In addition to introducing basic concepts and establishing terminology and notation, 
this chapter also provides the reader with some initial experience in working with precise 
definitions and writing proofs. The careful study of real analysis unavoidably entails the 
reading and writing of proofs, and, as with any skill, practice is necessary. This chapter is a
starting point. 




Section 1.1 Sets and Functions 


To the reader: In this section we give a brief review of the terminology and notation that 
will be used in this text. We suggest that you look through it quickly and come back later 
when you need to recall the meaning of a term or a symbol. 

If an element x is in a set A, we write 


x ∈ A

and say that x is a member of A, or that x belongs to A. If x is not in A, we write

x ∉ A.

If every element of a set A also belongs to a set B, we say that A is a subset of B and write 


A ⊆ B or B ⊇ A.

We say that a set A is a proper subset of a set B if A ⊆ B, but there is at least one element of
B that is not in A. In this case we sometimes write


A ⊂ B.


1.1.1 Definition Two sets A and B are said to be equal, and we write A = B, if they 
contain the same elements. 


Thus, to prove that the sets A and B are equal, we must show that 
A ⊆ B and B ⊆ A.


A set is normally defined by either listing its elements explicitly, or by specifying a 
property that determines the elements of the set. If P denotes a property that is meaningful 
and unambiguous for elements of a set S, then we write 


{x ∈ S : P(x)}


for the set of all elements x in S for which the property P is true. If the set S is understood 
from the context, then it is often omitted in this notation. 

Several special sets are used throughout this book, and they are denoted by standard 
symbols. (We will use the symbol := to mean that the symbol on the left is being defined 
by the symbol on the right.) 

• The set of natural numbers N := {1, 2, 3, ...},

• The set of integers Z := {0, 1, −1, 2, −2, ...},

• The set of rational numbers Q := {m/n : m, n ∈ Z and n ≠ 0},

• The set of real numbers R.


The set R of real numbers is of fundamental importance and will be discussed at length 
in Chapter 2. 


1.1.2 Examples (a) The set 
{x ∈ N : x² − 3x + 2 = 0}


consists of those natural numbers satisfying the stated equation. Since the only solutions of 
this quadratic equation are x = 1 and x = 2, we can denote this set more simply by {1, 2}. 


(b) A natural number n is even if it has the form n = 2k for some k ∈ N. The set of even
natural numbers can be written


{2k : k ∈ N},


which is less cumbersome than {n ∈ N : n = 2k, k ∈ N}. Similarly, the set of odd natural
numbers can be written


{2k − 1 : k ∈ N}.
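
Set-builder notation has a close analogue in set comprehensions. The following Python sketch is only an illustration under stated assumptions: the finite range `naturals` stands in for N (the cutoff 50 is arbitrary), and the names `roots`, `evens`, and `odds` are ours, not the text's.

```python
# Finite stand-in for N = {1, 2, 3, ...}; the cutoff 50 is an arbitrary assumption.
naturals = range(1, 51)

# {x in N : x^2 - 3x + 2 = 0}
roots = {x for x in naturals if x * x - 3 * x + 2 == 0}

# {2k : k in N} and {2k - 1 : k in N}, truncated to k = 1, ..., 5
evens = {2 * k for k in range(1, 6)}
odds = {2 * k - 1 for k in range(1, 6)}

print(roots)          # {1, 2}
print(sorted(evens))  # [2, 4, 6, 8, 10]
print(sorted(odds))   # [1, 3, 5, 7, 9]
```

A finite search of this kind can suggest what a set looks like, but it does not replace the argument that the quadratic has no other solutions.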


Set Operations 


We now define the methods of obtaining new sets from given ones. Note that these set 
operations are based on the meaning of the words “or,” “and,” and “not.” For the union, it
is important to be aware of the fact that the word “or” is used in the inclusive sense,
allowing the possibility that x may belong to both sets. In legal terminology, this inclusive
sense is sometimes indicated by “and/or.”

1.1.3 Definition (a) The union of sets A and B is the set

A ∪ B := {x : x ∈ A or x ∈ B}.

(b) The intersection of the sets A and B is the set

A ∩ B := {x : x ∈ A and x ∈ B}.

(c) The complement of B relative to A is the set

A\B := {x : x ∈ A and x ∉ B}.


Figure 1.1.1 (a) A ∪ B (b) A ∩ B (c) A\B


The set that has no elements is called the empty set and is denoted by the symbol ∅.
Two sets A and B are said to be disjoint if they have no elements in common; this can be
expressed by writing A ∩ B = ∅.

To illustrate the method of proving set equalities, we will next establish one of the 
De Morgan laws for three sets. The proof of the other one is left as an exercise. 


1.1.4 Theorem If A, B, C are sets, then


(a) A\(B ∪ C) = (A\B) ∩ (A\C),
(b) A\(B ∩ C) = (A\B) ∪ (A\C).


Proof. To prove (a), we will show that every element in A\(B ∪ C) is contained in both
(A\B) and (A\C), and conversely.

If x is in A\(B ∪ C), then x is in A, but x is not in B ∪ C. Hence x is in A, but x is neither
in B nor in C. Therefore, x is in A but not B, and x is in A but not C. Thus, x ∈ A\B and
x ∈ A\C, which shows that x ∈ (A\B) ∩ (A\C).

Conversely, if x ∈ (A\B) ∩ (A\C), then x ∈ (A\B) and x ∈ (A\C). Hence x ∈ A and
both x ∉ B and x ∉ C. Therefore, x ∈ A and x ∉ (B ∪ C), so that x ∈ A\(B ∪ C).

Since the sets (A\B) ∩ (A\C) and A\(B ∪ C) contain the same elements, they are
equal by Definition 1.1.1. Q.E.D.
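
Both De Morgan laws can be sanity-checked on small finite sets. The sketch below is our own illustration, not part of the text: the helper names `subsets` and `de_morgan_holds` and the three-element universe are assumptions. An exhaustive check of this kind confirms the identities only for the sets tried and is no substitute for the proof.

```python
from itertools import combinations

def subsets(universe):
    """Return every subset of a finite set as a frozenset."""
    items = list(universe)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]

def de_morgan_holds(A, B, C):
    """Check both identities of Theorem 1.1.4 for the given finite sets."""
    law_a = A - (B | C) == (A - B) & (A - C)
    law_b = A - (B & C) == (A - B) | (A - C)
    return law_a and law_b

# Exhaustive check over all triples of subsets of a small universe.
all_subsets = subsets({1, 2, 3})
checked = all(de_morgan_holds(A, B, C)
              for A in all_subsets
              for B in all_subsets
              for C in all_subsets)
print(checked)  # True
```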


There are times when it is desirable to form unions and intersections of more than two 
sets. For a finite collection of sets {A₁, A₂, ..., Aₙ}, their union is the set A consisting of
all elements that belong to at least one of the sets Aₖ, and their intersection consists of all
elements that belong to all of the sets Aₖ.

This is extended to an infinite collection of sets {A₁, A₂, ..., Aₙ, ...} as follows.
Their union is the set of elements that belong to at least one of the sets Aₙ. In this case we
write


⋃_{n=1}^{∞} Aₙ := {x : x ∈ Aₙ for some n ∈ N}.


Similarly, their intersection is the set of elements that belong to all of these sets Aₙ. In this
case we write


⋂_{n=1}^{∞} Aₙ := {x : x ∈ Aₙ for all n ∈ N}.


Functions 


In order to discuss functions, we define the Cartesian product of two sets. 


1.1.5 Definition If A and B are nonempty sets, then the Cartesian product A × B of A
and B is the set of all ordered pairs (a, b) with a ∈ A and b ∈ B. That is,


A × B := {(a, b) : a ∈ A, b ∈ B}.


Thus if A = {1, 2, 3} and B = {1, 5}, then the set A × B is the set whose elements are
the ordered pairs


(1, 1), (1, 5), (2, 1), (2, 5), (3, 1), (3, 5).


We may visualize the set A × B as the set of six points in the plane with the coordinates that
we have just listed.
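
For finite sets the Cartesian product can be enumerated directly. A minimal Python sketch, using the standard-library function `itertools.product`; the variable name `A_times_B` is our own choice:

```python
from itertools import product

A = {1, 2, 3}
B = {1, 5}

# A x B as a set of ordered pairs (a, b) with a in A and b in B.
A_times_B = set(product(A, B))

print(sorted(A_times_B))
# [(1, 1), (1, 5), (2, 1), (2, 5), (3, 1), (3, 5)]
print(len(A_times_B) == len(A) * len(B))  # True
```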

We often draw a diagram (such as Figure 1.1.2) to indicate the Cartesian product of 
two sets A and B. However, it should be realized that this diagram may be a simplification. 
For example, if A := {x ∈ R : 1 < x < 2} and B := {y ∈ R : 0 < y < 1 or 2 < y < 3},
then instead of a rectangle, we should have a drawing such as Figure 1.1.3. 


Figure 1.1.2    Figure 1.1.3


We will now discuss the fundamental notion of a function or a mapping. 

To the mathematician of the early nineteenth century, the word “function” meant a 
definite formula, such as f(x) := x² + 3x − 5, which associates to each real number x
another number f(x). (Here, f(0) = −5, f(1) = −1, f(5) = 35.) This understanding
excluded the case of different formulas on different intervals, so that functions could not be
defined “in pieces.”

As mathematics developed, it became clear that a more general definition of 
“function” would be useful. It also became evident that it is important to make a clear
distinction between the function itself and the values of the function. A revised definition
might be:

A function f from a set A into a set B is a rule of correspondence that assigns to
each element x in A a uniquely determined element f(x) in B.


But however suggestive this revised definition might be, there is the difficulty of 
interpreting the phrase “rule of correspondence.” In order to clarify this, we will express 
the definition entirely in terms of sets; in effect, we will define a function to be its graph. 
While this has the disadvantage of being somewhat artificial, it has the advantage of being 
unambiguous and clearer. 


1.1.6 Definition Let A and B be sets. Then a function from A to B is a set f of ordered
pairs in A × B such that for each a ∈ A there exists a unique b ∈ B with (a, b) ∈ f. (In other
words, if (a, b) ∈ f and (a, b′) ∈ f, then b = b′.)


The set A of first elements of a function f is called the domain of f and is often denoted
by D(f). The set of all second elements in f is called the range of f and is often denoted by
R(f). Note that, although D(f) = A, we only have R(f) ⊆ B. (See Figure 1.1.4.)

The essential condition that:


(a, b) ∈ f and (a, b′) ∈ f implies that b = b′


is sometimes called the vertical line test. In geometrical terms it says every vertical line
x = a with a ∈ A intersects the graph of f exactly once.
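
When A and B are finite, Definition 1.1.6 can be tested mechanically. The following Python sketch is our own illustration: the helper name `is_function` and the sample sets of pairs are assumptions, not notation from the text.

```python
def is_function(pairs, A, B):
    """Return True if `pairs` is a function from A to B in the sense of
    Definition 1.1.6: a subset of A x B in which every a in A occurs as a
    first element exactly once (the vertical line test)."""
    if any(a not in A or b not in B for (a, b) in pairs):
        return False  # not a subset of A x B
    first_elements = [a for (a, _) in pairs]
    return all(first_elements.count(a) == 1 for a in A)

A = {1, 2, 3}
B = {1, 5}

f = {(1, 5), (2, 1), (3, 5)}              # each a in A has exactly one partner
not_f = {(1, 1), (1, 5), (2, 1), (3, 5)}  # 1 is paired with both 1 and 5
partial = {(1, 1), (2, 5)}                # 3 has no partner

print(is_function(f, A, B))        # True
print(is_function(not_f, A, B))    # False
print(is_function(partial, A, B))  # False
```

Note that two different elements of A may share the same second element, as 1 and 3 do in `f`; only repeated first elements are ruled out.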
The notation 


f : A → B


is often used to indicate that f is a function from A into B. We will also say that f is a
mapping of A into B, or that f maps A into B. If (a, b) is an element in f, it is customary to
write


b = f(a) or sometimes a ↦ b.


Figure 1.1.4 A function as a graph 


If b = f(a), we often refer to b as the value of f at a, or as the image of a under f. 


Transformations and Machines 


Aside from using graphs, we can visualize a function as a transformation of the set D(f) =
A into the set R(f) ⊆ B. In this phraseology, when (a, b) ∈ f, we think of f as taking the
element a from A and “transforming” or “mapping” it into an element b = f(a) in
R(f) ⊆ B. We often draw a diagram, such as Figure 1.1.5, even when the sets A and B are
not subsets of the plane.




Figure 1.1.5 A function as a transformation 


There is another way of visualizing a function: namely, as a machine that accepts 
elements of D(f) = A as inputs and produces corresponding elements of R(f) ⊆ B as outputs.
If we take an element x ∈ D(f) and put it into f, then out comes the corresponding value f(x).
If we put a different element y ∈ D(f) into f, then out comes f(y), which may or may not differ
from f(x). If we try to insert something that does not belong to D(f) into f, we find that it is not
accepted, for f can operate only on elements from D(f). (See Figure 1.1.6.)




Figure 1.1.6 A function as a machine 


This last visualization makes clear the distinction between f and f(x): the first is the 
machine itself, and the second is the output of the machine f when x is the input. Whereas 
no one is likely to confuse a meat grinder with ground meat, enough people have confused 
functions with their values that it is worth distinguishing between them notationally. 


Direct and Inverse Images 


Let f : A → B be a function with domain D(f) = A and range R(f) ⊆ B.


1.1.7 Definition If E is a subset of A, then the direct image of E under f is the subset f(E)
of B given by


f(E) := {f(x) : x ∈ E}.

If H is a subset of B, then the inverse image of H under f is the subset f⁻¹(H) of A given by

f⁻¹(H) := {x ∈ A : f(x) ∈ H}.


Remark The notation f⁻¹(H) used in this connection has its disadvantages. However, we
will use it since it is the standard notation. 


Thus, if we are given a set E ⊆ A, then a point y₁ ∈ B is in the direct image f(E) if and
only if there exists at least one point x₁ ∈ E such that y₁ = f(x₁). Similarly, given a set
H ⊆ B, then a point x₂ is in the inverse image f⁻¹(H) if and only if y₂ := f(x₂) belongs to H.
(See Figure 1.1.7.)


Figure 1.1.7 Direct and inverse images 


1.1.8 Examples (a) Let f : R → R be defined by f(x) := x². Then the direct image of
the set E := {x : 0 ≤ x ≤ 2} is the set f(E) = {y : 0 ≤ y ≤ 4}.

If G := {y : 0 ≤ y ≤ 4}, then the inverse image of G is the set f⁻¹(G) = {x : −2 ≤
x ≤ 2}. Thus, in this case, we see that f⁻¹(f(E)) ≠ E.

On the other hand, we have f(f⁻¹(G)) = G. But if H := {y : −1 ≤ y ≤ 1}, then we
have f(f⁻¹(H)) = {y : 0 ≤ y ≤ 1} ≠ H.

A sketch of the graph of f may help to visualize these sets. 
(b) Let f : A → B, and let G, H be subsets of B. We will show that


f⁻¹(G ∩ H) ⊆ f⁻¹(G) ∩ f⁻¹(H).


For, if x ∈ f⁻¹(G ∩ H), then f(x) ∈ G ∩ H, so that f(x) ∈ G and f(x) ∈ H. But this implies
that x ∈ f⁻¹(G) and x ∈ f⁻¹(H), whence x ∈ f⁻¹(G) ∩ f⁻¹(H). Thus the stated
implication is proved. [The opposite inclusion is also true, so that we actually have set equality
between these sets; see Exercise 15.]
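
Direct and inverse images are easy to compute when the sets involved are finite. The Python sketch below is an illustration under assumptions of our own: the domain is cut down to the integers from −3 to 3, and the helper names `direct_image` and `inverse_image` are not from the text. It mirrors the behavior of the squaring function in Example 1.1.8 on this smaller domain.

```python
def direct_image(f, E):
    """f(E) := {f(x) : x in E}."""
    return {f(x) for x in E}

def inverse_image(f, A, H):
    """f^{-1}(H) := {x in A : f(x) in H}; the domain A must be supplied."""
    return {x for x in A if f(x) in H}

def square(x):
    return x * x

A = set(range(-3, 4))   # finite stand-in for the domain: {-3, ..., 3}
E = {0, 1, 2}

image = direct_image(square, E)          # {0, 1, 4}
back = inverse_image(square, A, image)   # {-2, -1, 0, 1, 2}
print(back == E)                         # False: f^{-1}(f(E)) is larger than E

H = {-1, 0, 1}
print(direct_image(square, inverse_image(square, A, H)))  # {0, 1}, not H

# Inclusion from Example 1.1.8(b), checked for one choice of G and H.
G = {0, 1, 4}
lhs = inverse_image(square, A, G & H)
rhs = inverse_image(square, A, G) & inverse_image(square, A, H)
print(lhs <= rhs)                        # True
```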


Further facts about direct and inverse images are given in the exercises. 


Special Types of Functions 


The following definitions identify some very important types of functions. 


1.1.9 Definition Let f : A → B be a function from A to B.


(a) The function f is said to be injective (or to be one-one) if whenever x₁ ≠ x₂, then
f(x₁) ≠ f(x₂). If f is an injective function, we also say that f is an injection.

(b) The function f is said to be surjective (or to map A onto B) if f(A) = B; that is, if the
range R(f) = B. If f is a surjective function, we also say that f is a surjection.

(c) If f is both injective and surjective, then f is said to be bijective. If f is bijective, we
also say that f is a bijection.


• In order to prove that a function f is injective, we must establish that:

for all x₁, x₂ in A, if f(x₁) = f(x₂), then x₁ = x₂.

To do this we assume that f(x₁) = f(x₂) and show that x₁ = x₂.

[In other words, the graph of f satisfies the first horizontal line test: Every horizontal line
y = b with b ∈ B intersects the graph f in at most one point.]

• To prove that a function f is surjective, we must show that for any b ∈ B there exists at
least one x ∈ A such that f(x) = b.

[In other words, the graph of f satisfies the second horizontal line test: Every horizontal
line y = b with b ∈ B intersects the graph f in at least one point.]
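
For functions on finite sets, the two horizontal line tests reduce to counting. A minimal Python sketch follows; the helper names `is_injective`, `is_surjective`, and `is_bijective` and the sample functions are our own assumptions, and the checks apply only to the finite sets supplied.

```python
def is_injective(f, A):
    """True if distinct elements of A have distinct values under f."""
    values = [f(x) for x in A]
    return len(values) == len(set(values))

def is_surjective(f, A, B):
    """True if f(A) = B, that is, every b in B equals f(x) for some x in A."""
    return {f(x) for x in A} == set(B)

def is_bijective(f, A, B):
    return is_injective(f, A) and is_surjective(f, A, B)

def square(x):
    return x * x

def shift(x):
    return x + 1

A = {-2, -1, 0, 1, 2}

print(is_injective(square, A))                   # False: (-1)^2 == 1^2
print(is_surjective(square, A, {0, 1, 4}))       # True
print(is_bijective(shift, A, {-1, 0, 1, 2, 3}))  # True
```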


1.1.10 Example Let A := {x ∈ R : x ≠ 1} and define f(x) := 2x/(x − 1) for all x ∈ A.
To show that f is injective, we take x₁ and x₂ in A and assume that f(x₁) = f(x₂). Thus we
have


2x₁/(x₁ − 1) = 2x₂/(x₂ − 1),


which implies that x₁(x₂ − 1) = x₂(x₁ − 1), and hence x₁ = x₂. Therefore f is injective.

To determine the range of f, we solve the equation y = 2x/(x − 1) for x in terms of y.
We obtain x = y/(y − 2), which is meaningful for y ≠ 2. Thus the range of f is the set
B := {y ∈ R : y ≠ 2}. Thus, f is a bijection of A onto B.


Inverse Functions 


If f is a function from A into B, then f is a special subset of A × B (namely, one passing the
vertical line test). The set of ordered pairs in B × A obtained by interchanging the members
of ordered pairs in f is not generally a function. (That is, the set f may not pass both of the
horizontal line tests.) However, if f is a bijection, then this interchange does lead to a 
function, called the “inverse function” of f. 


1.1.11 Definition If f : A → B is a bijection of A onto B, then

g := {(b, a) ∈ B × A : (a, b) ∈ f}

is a function on B into A. This function is called the inverse function of f, and is denoted
by f⁻¹. The function f⁻¹ is also called the inverse of f.

We can also express the connection between f and its inverse f⁻¹ by noting that
D(f) = R(f⁻¹) and R(f) = D(f⁻¹) and that

b = f(a) if and only if a = f⁻¹(b).

For example, we saw in Example 1.1.10 that the function 


f(x) := 2x/(x − 1)


is a bijection of A := {x ∈ R : x ≠ 1} onto the set B := {y ∈ R : y ≠ 2}. Solving y = f(x)
for x in terms of y, we find the function inverse to f is given by


f⁻¹(y) := y/(y − 2) for y ∈ B.

Remark We introduced the notation f⁻¹(H) in Definition 1.1.7. It makes sense even if f
does not have an inverse function. However, if the inverse function f⁻¹ does exist, then
f⁻¹(H) is the direct image of the set H ⊆ B under f⁻¹.
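
The inverse found for the function of Example 1.1.10 can be spot-checked with exact rational arithmetic. The following Python sketch is an illustration only: the sample points and the names `f`, `f_inv`, `xs`, and `ys` are our own choices, and agreement at finitely many points does not prove that the two functions are inverse to each other.

```python
from fractions import Fraction

def f(x):
    """f(x) := 2x/(x - 1), defined for x != 1."""
    return 2 * x / (x - 1)

def f_inv(y):
    """Candidate inverse y/(y - 2), defined for y != 2."""
    return y / (y - 2)

# Exact rational sample points; 1 is excluded from A and 2 from B.
xs = [Fraction(n, 4) for n in range(-20, 21) if Fraction(n, 4) != 1]
ys = [Fraction(n, 4) for n in range(-20, 21) if Fraction(n, 4) != 2]

print(all(f_inv(f(x)) == x for x in xs))  # True
print(all(f(f_inv(y)) == y for y in ys))  # True
```

Using `Fraction` rather than floating-point numbers keeps the equality tests exact.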


Composition of Functions 


It often happens that we want to “compose” two functions f, g by first finding f(x) and then 
applying g to get g (f(x)); however, this is possible only when f(x) belongs to the domain of 
g. In order to be able to do this for all f(x), we must assume that the range of f is contained
in the domain of g. (See Figure 1.1.8.) 




Figure 1.1.8 The composition of f and g 


1.1.12 Definition If f : A → B and g : B → C, and if R(f) ⊆ D(g) = B, then the
composite function g ∘ f (note the order!) is the function from A into C defined by


(g ∘ f)(x) := g(f(x)) for all x ∈ A.


1.1.13 Examples (a) The order of the composition must be carefully noted. For, let f
and g be the functions whose values at x ∈ R are given by


f(x) := 2x and g(x) := 3x² − 1.

Since D(g) = R and R(f) ⊆ R = D(g), then the domain D(g ∘ f) is also equal to R, and the
composite function g ∘ f is given by

(g ∘ f)(x) = 3(2x)² − 1 = 12x² − 1.

On the other hand, the domain of the composite function f ∘ g is also R, but

(f ∘ g)(x) = 2(3x² − 1) = 6x² − 2.

Thus, in this case, we have g ∘ f ≠ f ∘ g.


(b) In considering g ∘ f, some care must be exercised to be sure that the range of f is
contained in the domain of g. For example, if


f(x) := 1 − x² and g(x) := √x,

then, since D(g) = {x : x ≥ 0}, the composite function g ∘ f is given by the formula

(g ∘ f)(x) = √(1 − x²)


only for x ∈ D(f) that satisfy f(x) ≥ 0; that is, for x satisfying −1 ≤ x ≤ 1.

We note that if we reverse the order, then the composition f ∘ g is given by the formula


(f ∘ g)(x) = 1 − x,


but only for those x in the domain D(g) = {x : x ≥ 0}.
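
Both parts of Example 1.1.13 can be reproduced in a few lines of Python. In this sketch the helper `compose` and the function names are assumptions of our own. Python's `math.sqrt` raises `ValueError` for negative arguments, which here plays the role of the domain restriction on g.

```python
import math

def compose(g, f):
    """Return g o f, the function x -> g(f(x)) (apply f first, then g)."""
    return lambda x: g(f(x))

# Example 1.1.13(a): f(x) := 2x and g(x) := 3x^2 - 1.
def f(x):
    return 2 * x

def g(x):
    return 3 * x * x - 1

g_of_f = compose(g, f)   # x -> 12x^2 - 1
f_of_g = compose(f, g)   # x -> 6x^2 - 2
print(g_of_f(1), f_of_g(1))  # 11 4

# Example 1.1.13(b): f2(x) := 1 - x^2 composed with the square root.
def f2(x):
    return 1 - x * x

sqrt_of_f2 = compose(math.sqrt, f2)
print(sqrt_of_f2(0.0))       # 1.0

try:
    sqrt_of_f2(2.0)          # f2(2) = -3 lies outside the domain of sqrt
except ValueError:
    print("2.0 is not in the domain of g o f")
```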


We now give the relationship between composite functions and inverse images. The 
proof is left as an instructive exercise. 


1.1.14 Theorem Let f : A → B and g : B → C be functions and let H be a subset of C.
Then we have


(g ∘ f)⁻¹(H) = f⁻¹(g⁻¹(H)).


Note the reversal in the order of the functions. 
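
Before attempting the proof, it may help to see Theorem 1.1.14 confirmed in one finite case. The Python sketch below is our own illustration; the sets A, B, H, the functions f and g, and the helper `inverse_image` are all assumptions chosen for the example.

```python
def inverse_image(f, domain, H):
    """{x in domain : f(x) in H}."""
    return {x for x in domain if f(x) in H}

def f(x):
    return x * x      # f : A -> B

def g(y):
    return y + 1      # g : B -> C

def g_of_f(x):
    return g(f(x))    # g o f : A -> C

A = set(range(-3, 4))            # domain of f
B = {f(x) for x in A} | {2, 3}   # a set containing the range of f
H = {1, 2, 5}                    # a subset of C

lhs = inverse_image(g_of_f, A, H)                  # (g o f)^{-1}(H)
rhs = inverse_image(f, A, inverse_image(g, B, H))  # f^{-1}(g^{-1}(H))
print(lhs == rhs)   # True
print(sorted(lhs))  # [-2, -1, 0, 1, 2]
```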


Restrictions of Functions 


If f : A → B is a function and if A₁ ⊆ A, we can define a function f₁ : A₁ → B by

f₁(x) := f(x) for x ∈ A₁.


The function f₁ is called the restriction of f to A₁. Sometimes it is denoted by f₁ = f|A₁.

It may seem strange to the reader that one would ever choose to throw away a part of a 
function, but there are some good reasons for doing so. For example, if f : R → R is the
squaring function:


f(x) := x² for x ∈ R,


then f is not injective, so it cannot have an inverse function. However, if we restrict f to the set
A₁ := {x : x > 0}, then the restriction f|A₁ is a bijection of A₁ onto A₁. Therefore, this
restriction has an inverse function, which is the positive square root function. (Sketch a 
graph.) 

Similarly, the trigonometric functions S(x) := sin x and C(x) := cos x are not injective on 
all of R. However, by making suitable restrictions of these functions, one can obtain the inverse 
sine and the inverse cosine functions that the reader has undoubtedly already encountered. 


Exercises for Section 1.1 


1. Let A := {k : k ∈ N, k < 20}, B := {3k − 1 : k ∈ N}, and C := {2k + 1 : k ∈ N}.
Determine the sets:
(a) A ∩ B ∩ C,
(b) (A ∩ B)\C,
(c) (A ∩ C)\B.

2. Draw diagrams to simplify and identify the following sets:
(a) A\(B\A),
(b) A\(A\B),
(c) A ∩ (B\A).

3. If A and B are sets, show that A ⊆ B if and only if A ∩ B = A.

4. Prove the second De Morgan Law [Theorem 1.1.4(b)].

5. Prove the Distributive Laws:
(a) A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C),
(b) A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C).

6. The symmetric difference of two sets A and B is the set D of all elements that belong to either A
or B but not both. Represent D with a diagram.
(a) Show that D = (A\B) ∪ (B\A).
(b) Show that D is also given by D = (A ∪ B)\(A ∩ B).

7. For each n ∈ N, let Aₙ = {(n + 1)k : k ∈ N}.
(a) What is A₁ ∩ A₂?
(b) Determine the sets ⋃{Aₙ : n ∈ N} and ⋂{Aₙ : n ∈ N}.

8. Draw diagrams in the plane of the Cartesian products A × B for the given sets A and B.
(a) A = {x ∈ R : 1 < x < 2 or 3 < x < 4}, B = {x ∈ R : x = 1 or x = 2}.
(b) A = {1, 2, 3}, B = {x ∈ R : 1 < x < 3}.

9. Let A := B := {x ∈ R : −1 ≤ x ≤ 1} and consider the subset C := {(x, y) : x² + y² = 1} of
A × B. Is this set a function? Explain.

10. Let f(x) := 1/x², x ≠ 0, x ∈ R.
(a) Determine the direct image f(E) where E := {x ∈ R : 1 < x < 2}.
(b) Determine the inverse image f⁻¹(G) where G := {x ∈ R : 1 < x < 4}.

11. Let g(x) := x² and f(x) := x + 2 for x ∈ R, and let h be the composite function h := g ∘ f.
(a) Find the direct image h(E) of E := {x ∈ R : 0 < x < 1}.
(b) Find the inverse image h⁻¹(G) of G := {x ∈ R : 0 < x < 4}.

12. Let f(x) := x² for x ∈ R, and let E := {x ∈ R : −1 ≤ x ≤ 0} and F := {x ∈ R : 0 ≤ x ≤ 1}.
Show that E ∩ F = {0} and f(E ∩ F) = {0}, while f(E) = f(F) = {y ∈ R : 0 ≤ y ≤ 1}.
Hence f(E ∩ F) is a proper subset of f(E) ∩ f(F). What happens if 0 is deleted from the sets E
and F?

13. Let f and E, F be as in Exercise 12. Find the sets E\F and f(E)\f(F) and show that it is not true
that f(E\F) ⊆ f(E)\f(F).

14. Show that if f : A → B and E, F are subsets of A, then f(E ∪ F) = f(E) ∪ f(F) and
f(E ∩ F) ⊆ f(E) ∩ f(F).

15. Show that if f : A → B and G, H are subsets of B, then f⁻¹(G ∪ H) = f⁻¹(G) ∪ f⁻¹(H) and
f⁻¹(G ∩ H) = f⁻¹(G) ∩ f⁻¹(H).

16. Show that the function f defined by f(x) := x/√(x² + 1), x ∈ R, is a bijection of R onto
{y : −1 < y < 1}.

17. For a, b ∈ R with a < b, find an explicit bijection of A := {x : a < x < b} onto
B := {y : 0 < y < 1}.

18. (a) Give an example of two functions f, g on R to R such that f ≠ g, but such that f ∘ g = g ∘ f.
(b) Give an example of three functions f, g, h on R such that f ∘ (g + h) ≠ f ∘ g + f ∘ h.

19. (a) Show that if f : A → B is injective and E ⊆ A, then f⁻¹(f(E)) = E. Give an example to
show that equality need not hold if f is not injective.
(b) Show that if f : A → B is surjective and H ⊆ B, then f(f⁻¹(H)) = H. Give an example to
show that equality need not hold if f is not surjective.

20. (a) Suppose that f is an injection. Show that f⁻¹ ∘ f(x) = x for all x ∈ D(f) and that f ∘ f⁻¹(y) = y
for all y ∈ R(f).
(b) If f is a bijection of A onto B, show that f⁻¹ is a bijection of B onto A.

21. Prove that if f : A → B is bijective and g : B → C is bijective, then the composite g ∘ f is a
bijective map of A onto C.

22. Let f : A → B and g : B → C be functions.
(a) Show that if g ∘ f is injective, then f is injective.
(b) Show that if g ∘ f is surjective, then g is surjective.

23. Prove Theorem 1.1.14.

24. Let f, g be functions such that (g ∘ f)(x) = x for all x ∈ D(f) and (f ∘ g)(y) = y for all y ∈ D(g).
Prove that g = f⁻¹.


Section 1.2 Mathematical Induction


Mathematical Induction is a powerful method of proof that is frequently used to establish 
the validity of statements that are given in terms of the natural numbers. Although its utility 
is restricted to this rather special context, Mathematical Induction is an indispensable tool 
in all branches of mathematics. Since many induction proofs follow the same formal lines 
of argument, we will often state only that a result follows from Mathematical Induction and 
leave it to the reader to provide the necessary details. In this section, we will state the 
principle and give several examples to illustrate how inductive proofs proceed. 
We shall assume familiarity with the set of natural numbers: 


N = {1, 2, 3,...}, 


with the usual arithmetic operations of addition and multiplication, and with the meaning 
of a natural number being less than another one. We will also assume the following 
fundamental property of N. 


1.2.1 Well-Ordering Property of N Every nonempty subset of N has a least element. 


A more detailed statement of this property is as follows: If S is a subset of N and if
S ≠ ∅, then there exists m ∈ S such that m ≤ k for all k ∈ S.

On the basis of the Well-Ordering Property, we shall derive a version of the Principle 
of Mathematical Induction that is expressed in terms of subsets of N. 


1.2.2 Principle of Mathematical Induction Let S be a subset of N that possesses the 
two properties: 


(1) The number 1 ∈ S.
(2) For every k ∈ N, if k ∈ S, then k + 1 ∈ S.


Then we have S = N.


Proof. Suppose to the contrary that S ≠ N. Then the set N\S is not empty, so by the
Well-Ordering Property it has a least element m. Since 1 ∈ S by hypothesis (1), we know that
m > 1. But this implies that m − 1 is also a natural number. Since m − 1 < m and since m is
the least element in N such that m ∉ S, we conclude that m − 1 ∈ S.

We now apply hypothesis (2) to the element k := m − 1 in S, to infer that k + 1 =
(m − 1) + 1 = m belongs to S. But this statement contradicts the fact that m ∉ S. Since m
was obtained from the assumption that N\S is not empty, we have obtained a contradiction.
Therefore we must have S = N. Q.E.D.


The Principle of Mathematical Induction is often set forth in the framework of 
statements about natural numbers. If P(n) is a meaningful statement about n ∈ N, then P(n)
may be true for some values of n and false for others. For example, if $P_1(n)$ is the statement:
“$n^2 = n$,” then $P_1(1)$ is true while $P_1(n)$ is false for all n > 1, n ∈ N. On the other hand, if
$P_2(n)$ is the statement: “$n^2 > 1$,” then $P_2(1)$ is false, while $P_2(n)$ is true for all n > 1.

In this context, the Principle of Mathematical Induction can be formulated as follows. 


For each n ∈ N, let P(n) be a statement about n. Suppose that:


(1′) P(1) is true.
(2′) For every k ∈ N, if P(k) is true, then P(k + 1) is true.


Then P(n) is true for all n ∈ N.

The connection with the preceding version of Mathematical Induction, given in 1.2.2,
is made by letting S := {n ∈ N : P(n) is true}. Then the conditions (1) and (2) of 1.2.2
correspond exactly to the conditions (1′) and (2′), respectively. The conclusion that S = N
in 1.2.2 corresponds to the conclusion that P(n) is true for all n ∈ N.

In (2′) the assumption “if P(k) is true” is called the induction hypothesis. In
establishing (2′), we are not concerned with the actual truth or falsity of P(k), but only
with the validity of the implication “if P(k), then P(k + 1).” For example, if we
consider the statements P(n): “n = n + 5,” then (2′) is logically correct, for we can
simply add 1 to both sides of P(k) to obtain P(k + 1). However, since the statement
P(1): “1 = 6” is false, we cannot use Mathematical Induction to conclude that n = n + 5
for all n ∈ N.

It may happen that statements P(n) are false for certain natural numbers but then are
true for all $n \ge n_0$ for some particular $n_0$. The Principle of Mathematical Induction can be
modified to deal with this situation. We will formulate the modified principle, but leave its
verification as an exercise. (See Exercise 12.)


1.2.3 Principle of Mathematical Induction (second version) Let $n_0$ ∈ N and let P(n)
be a statement for each natural number $n \ge n_0$. Suppose that:


(1) The statement $P(n_0)$ is true.
(2) For all $k \ge n_0$, the truth of P(k) implies the truth of P(k + 1).


Then P(n) is true for all $n \ge n_0$.


Sometimes the number $n_0$ in (1) is called the base, since it serves as the starting point,
and the implication in (2), which can be written P(k) ⇒ P(k + 1), is called the bridge,
since it connects the case k to the case k + 1.

The following examples illustrate how Mathematical Induction is used to prove 
assertions about natural numbers. 


1.2.4 Examples (a) For each n ∈ N, the sum of the first n natural numbers is given by

$$1 + 2 + \cdots + n = \tfrac{1}{2} n(n + 1).$$


To prove this formula, we let S be the set of all n ∈ N for which the formula is true.
We must verify that conditions (1) and (2) of 1.2.2 are satisfied. If n = 1, then we have
$1 = \tfrac{1}{2} \cdot 1 \cdot (1 + 1)$ so that 1 ∈ S, and (1) is satisfied. Next, we assume that k ∈ S and wish to
infer from this assumption that k + 1 ∈ S. Indeed, if k ∈ S, then

$$1 + 2 + \cdots + k = \tfrac{1}{2} k(k + 1).$$

If we add k + 1 to both sides of the assumed equality, we obtain

$$
\begin{aligned}
1 + 2 + \cdots + k + (k + 1) &= \tfrac{1}{2} k(k + 1) + (k + 1) \\
&= \tfrac{1}{2} (k + 1)(k + 2).
\end{aligned}
$$

Since this is the stated formula for n = k + 1, we conclude that k + 1 ∈ S. Therefore,
condition (2) of 1.2.2 is satisfied. Consequently, by the Principle of Mathematical
Induction, we infer that S = N, so the formula holds for all n ∈ N.


(b) For each n ∈ N, the sum of the squares of the first n natural numbers is given by

$$1^2 + 2^2 + \cdots + n^2 = \tfrac{1}{6} n(n + 1)(2n + 1).$$

To establish this formula, we note that it is true for n = 1, since $1^2 = \tfrac{1}{6} \cdot 1 \cdot 2 \cdot 3$. If we
assume it is true for k, then adding $(k + 1)^2$ to both sides of the assumed formula gives

$$
\begin{aligned}
1^2 + 2^2 + \cdots + k^2 + (k + 1)^2 &= \tfrac{1}{6} k(k + 1)(2k + 1) + (k + 1)^2 \\
&= \tfrac{1}{6} (k + 1)(2k^2 + k + 6k + 6) \\
&= \tfrac{1}{6} (k + 1)(k + 2)(2k + 3).
\end{aligned}
$$

Consequently, the formula is valid for all n ∈ N.
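
The closed forms in Examples (a) and (b) can also be checked numerically for small n. The following is only a sanity check under the assumption that brute-force summation is a fair reference; it does not replace the induction argument, and the function names are illustrative rather than taken from the text.

```python
def sum_first_n(n: int) -> int:
    """Closed form from Example 1.2.4(a): 1 + 2 + ... + n = n(n + 1)/2."""
    return n * (n + 1) // 2


def sum_first_n_squares(n: int) -> int:
    """Closed form from Example 1.2.4(b): 1^2 + ... + n^2 = n(n + 1)(2n + 1)/6."""
    return n * (n + 1) * (2 * n + 1) // 6


# Compare each closed form with the brute-force sum for n = 1, ..., 200.
for n in range(1, 201):
    assert sum_first_n(n) == sum(range(1, n + 1))
    assert sum_first_n_squares(n) == sum(j * j for j in range(1, n + 1))
```

A finite check of this kind can expose a mistyped formula, but only the induction proof covers every n ∈ N.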


(c) Given two real numbers a and b, we will prove that a − b is a factor of $a^n - b^n$ for all
n ∈ N.

First we see that the statement is clearly true for n = 1. If we now assume that a − b is a
factor of $a^k - b^k$, then

$$
\begin{aligned}
a^{k+1} - b^{k+1} &= a^{k+1} - ab^k + ab^k - b^{k+1} \\
&= a(a^k - b^k) + b^k(a - b).
\end{aligned}
$$

By the induction hypothesis, a − b is a factor of $a(a^k - b^k)$ and it is plainly a factor of
$b^k(a - b)$. Therefore, a − b is a factor of $a^{k+1} - b^{k+1}$, and it follows from Mathematical
Induction that a − b is a factor of $a^n - b^n$ for all n ∈ N.

A variety of divisibility results can be derived from this fact. For example, since
11 − 7 = 4, we see that $11^n - 7^n$ is divisible by 4 for all n ∈ N.
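
For integer a and b, the divisibility statement in Example (c) can be spot-checked by machine. This is a minimal sketch restricted to integers with a ≠ b (an assumption made so that the remainder operation is meaningful); the helper name is hypothetical.

```python
def divides_power_difference(a: int, b: int, n: int) -> bool:
    """Return True if a - b divides a**n - b**n (integers only, a != b)."""
    if a == b:
        raise ValueError("a - b must be nonzero")
    return (a**n - b**n) % (a - b) == 0


# The special case from the text: 11**n - 7**n is divisible by 11 - 7 = 4.
for n in range(1, 51):
    assert (11**n - 7**n) % 4 == 0
    assert divides_power_difference(11, 7, n)
```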


(d) The inequality $2^n > 2n + 1$ is false for n = 1, 2, but it is true for n = 3. If we assume
that $2^k > 2k + 1$, then multiplication by 2 gives, when 2k + 2 > 3, the inequality

$$2^{k+1} > 2(2k + 1) = 4k + 2 = 2k + (2k + 2) > 2k + 3 = 2(k + 1) + 1.$$

Since 2k + 2 > 3 for all k ≥ 1, the bridge is valid for all k ≥ 1 (even though the statement is
false for k = 1, 2). Hence, with the base $n_0 = 3$, we can apply Mathematical Induction to
conclude that the inequality holds for all n ≥ 3.
(e) The inequality $2^n \le (n + 1)!$ can be established by Mathematical Induction.

We first observe that it is true for n = 1, since $2^1 = 2 = 1 + 1$. If we assume that
$2^k \le (k + 1)!$, it follows from the fact that 2 ≤ k + 2 that

$$2^{k+1} = 2 \cdot 2^k \le 2(k + 1)! \le (k + 2)(k + 1)! = (k + 2)!.$$

Thus, if the inequality holds for k, then it also holds for k + 1. Therefore, Mathematical
Induction implies that the inequality is true for all n ∈ N.
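
The role of the base in Examples (d) and (e) can be seen by evaluating both inequalities for the first few values of n. The sketch below is illustrative only; the predicate names are assumptions, and the range checked (up to n = 100) is arbitrary.

```python
from math import factorial


def inequality_d(n: int) -> bool:
    """Statement of Example (d): 2**n > 2n + 1."""
    return 2**n > 2 * n + 1


def inequality_e(n: int) -> bool:
    """Statement of Example (e): 2**n <= (n + 1)!."""
    return 2**n <= factorial(n + 1)


# (d) fails at n = 1, 2 and holds from the base n0 = 3 onward (checked up to 100).
assert not inequality_d(1) and not inequality_d(2)
assert all(inequality_d(n) for n in range(3, 101))

# (e) already holds at n = 1, so the base is 1 (checked up to 100).
assert all(inequality_e(n) for n in range(1, 101))
```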


(f) If r ∈ R, r ≠ 1, and n ∈ N, then

$$1 + r + r^2 + \cdots + r^n = \frac{1 - r^{n+1}}{1 - r}.$$

This is the formula for the sum of the terms in a “geometric progression.” It can
be established using Mathematical Induction as follows. First, if n = 1, then
$1 + r = (1 - r^2)/(1 - r)$. If we assume the truth of the formula for n = k and add the term $r^{k+1}$
to both sides, we get (after a little algebra)

$$1 + r + r^2 + \cdots + r^k + r^{k+1} = \frac{1 - r^{k+1}}{1 - r} + r^{k+1} = \frac{1 - r^{k+2}}{1 - r},$$

which is the formula for n = k + 1. Therefore, Mathematical Induction implies the validity
of the formula for all n ∈ N.

[This result can also be proved without using Mathematical Induction. If we let
$s_n := 1 + r + r^2 + \cdots + r^n$, then $r s_n = r + r^2 + \cdots + r^{n+1}$, so that

$$(1 - r) s_n = s_n - r s_n = 1 - r^{n+1}.$$

If we divide by 1 − r, we obtain the stated formula.]
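
The geometric sum formula can be checked with exact rational arithmetic, which avoids floating-point rounding. This is a sketch under the assumption that r is rational (the text allows any real r ≠ 1); the function names are illustrative.

```python
from fractions import Fraction


def geometric_sum_direct(r: Fraction, n: int) -> Fraction:
    """Add the terms 1 + r + r**2 + ... + r**n one by one."""
    return sum((r**j for j in range(n + 1)), Fraction(0))


def geometric_sum_closed(r: Fraction, n: int) -> Fraction:
    """Closed form (1 - r**(n + 1)) / (1 - r), valid only for r != 1."""
    if r == 1:
        raise ValueError("the closed form requires r != 1")
    return (1 - r ** (n + 1)) / (1 - r)


# Compare both computations for a few sample ratios and n = 1, ..., 20.
for r in (Fraction(1, 2), Fraction(-3), Fraction(7, 5)):
    for n in range(1, 21):
        assert geometric_sum_direct(r, n) == geometric_sum_closed(r, n)
```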


(g) Careless use of the Principle of Mathematical Induction can lead to obviously absurd
conclusions. The reader is invited to find the error in the “proof” of the following assertion.


Claim: If n ∈ N and if the maximum of the natural numbers p and q is n, then p = q.


“Proof.” Let S be the subset of N for which the claim is true. Evidently, 1 ∈ S since if
p, q ∈ N and their maximum is 1, then both equal 1 and so p = q. Now assume that k ∈ S and
that the maximum of p and q is k + 1. Then the maximum of p − 1 and q − 1 is k. But since
k ∈ S, then p − 1 = q − 1 and therefore p = q. Thus, k + 1 ∈ S, and we conclude that the
assertion is true for all n ∈ N.

(h) There are statements that are true for many natural numbers but that are not true for
all of them.

For example, the formula $p(n) := n^2 - n + 41$ gives a prime number for n = 1, 2, ...,
40. However, p(41) is obviously divisible by 41, so it is not a prime number.
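
The claim in Example (h) is a finite statement, so it can be verified directly. The sketch below uses simple trial division as the primality test (an implementation choice made here, not something specified in the text).

```python
def is_prime(m: int) -> bool:
    """Trial-division primality test, adequate for the small values used here."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True


def p(n: int) -> int:
    """The formula from Example 1.2.4(h)."""
    return n * n - n + 41


# p(n) is prime for n = 1, ..., 40, but p(41) = 41 * 41 is not.
assert all(is_prime(p(n)) for n in range(1, 41))
assert p(41) == 41 * 41 and not is_prime(p(41))
```

The example shows why checking many cases is no substitute for the bridge step: forty consecutive successes are followed by a failure.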


Another version of the Principle of Mathematical Induction is sometimes quite useful. 
It is called the “Principle of Strong Induction,” even though it is in fact equivalent to 1.2.2. 


1.2.5 Principle of Strong Induction Let S be a subset of N such that


(1″) 1 ∈ S.
(2″) For every k ∈ N, if {1, 2, ..., k} ⊆ S, then k + 1 ∈ S.

Then S = N.


We will leave it to the reader to establish the equivalence of 1.2.2 and 1.2.5. 


Exercises for Section 1.2 


1. Prove that $\frac{1}{1 \cdot 2} + \frac{1}{2 \cdot 3} + \cdots + \frac{1}{n(n + 1)} = \frac{n}{n + 1}$ for all n ∈ N.

2. Prove that $1^3 + 2^3 + \cdots + n^3 = \left[\tfrac{1}{2} n(n + 1)\right]^2$ for all n ∈ N.

3. Prove that $3 + 11 + \cdots + (8n - 5) = 4n^2 - n$ for all n ∈ N.

4. Prove that $1^2 + 3^2 + \cdots + (2n - 1)^2 = (4n^3 - n)/3$ for all n ∈ N.

5. Prove that $1^2 - 2^2 + 3^2 + \cdots + (-1)^{n+1} n^2 = (-1)^{n+1} n(n + 1)/2$ for all n ∈ N.

6. Prove that $n^3 + 5n$ is divisible by 6 for all n ∈ N.

7. Prove that $5^{2n} - 1$ is divisible by 8 for all n ∈ N.

8. Prove that $5^n - 4n - 1$ is divisible by 16 for all n ∈ N.

9. Prove that $n^3 + (n + 1)^3 + (n + 2)^3$ is divisible by 9 for all n ∈ N.

10. Conjecture a formula for the sum $\frac{1}{1 \cdot 3} + \frac{1}{3 \cdot 5} + \cdots + \frac{1}{(2n - 1)(2n + 1)}$, and prove your
conjecture by using Mathematical Induction.

11. Conjecture a formula for the sum of the first n odd natural numbers 1 + 3 + ··· + (2n − 1), and
prove your formula by using Mathematical Induction.

12. Prove the Principle of Mathematical Induction 1.2.3 (second version).


13. Prove that $n < 2^n$ for all n ∈ N.

14. Prove that $2^n < n!$ for all n ≥ 4, n ∈ N.

15. Prove that $2n - 3 \le 2^{n-2}$ for all n ≥ 5, n ∈ N.

16. Find all natural numbers n such that $n^2 < 2^n$. Prove your assertion.

17. Find the largest natural number m such that $n^3 - n$ is divisible by m for all n ∈ N. Prove your
assertion.

18. Prove that $1/\sqrt{1} + 1/\sqrt{2} + \cdots + 1/\sqrt{n} > \sqrt{n}$ for all n ∈ N, n > 1.

19. Let S be a subset of N such that (a) $2^k$ ∈ S for all k ∈ N, and (b) if k ∈ S and k ≥ 2, then
k − 1 ∈ S. Prove that S = N.

20. Let the numbers $x_n$ be defined as follows: $x_1 := 1$, $x_2 := 2$, and $x_{n+2} := \tfrac{1}{2}(x_{n+1} + x_n)$ for all
n ∈ N. Use the Principle of Strong Induction (1.2.5) to show that $1 \le x_n \le 2$ for all n ∈ N.


Section 1.3 Finite and Infinite Sets 


When we count the elements in a set, we say “one, two, three, . . . ,” stopping when we
have exhausted the set. From a mathematical perspective, what we are doing is defining a 
bijective mapping between the set and a portion of the set of natural numbers. If the set is 
such that the counting does not terminate, such as the set of natural numbers itself, then we 
describe the set as being infinite. 

The notions of “finite” and “infinite” are extremely primitive, and it is very likely that 
the reader has never examined these notions very carefully. In this section we will define 
these terms precisely and establish a few basic results and state some other important 
results that seem obvious but whose proofs are a bit tricky. These proofs can be found in 
Appendix B and can be read later. 


1.3.1 Definition (a) The empty set ∅ is said to have 0 elements.

(b) If n ∈ N, a set S is said to have n elements if there exists a bijection from the set
$N_n := \{1, 2, \ldots, n\}$ onto S.

(c) A set S is said to be finite if it is either empty or it has n elements for some n ∈ N.

(d) A set S is said to be infinite if it is not finite.


Since the inverse of a bijection is a bijection, it is easy to see that a set S has n
elements if and only if there is a bijection from S onto the set {1, 2, ..., n}. Also,
since the composition of two bijections is a bijection, we see that a set $S_1$ has n
elements if and only if there is a bijection from $S_1$ onto another set $S_2$ that has n
elements. Further, a set $T_1$ is finite if and only if there is a bijection from $T_1$ onto
another set $T_2$ that is finite.

It is now necessary to establish some basic properties of finite sets to be sure that the 
definitions do not lead to conclusions that conflict with our experience of counting. From 
the definitions, it is not entirely clear that a finite set might not have n elements for more 
than one value of n. Also it is conceivably possible that the set N := {1, 2, 3,...} might be 
a finite set according to this definition. The reader will be relieved that these possibilities do 
not occur, as the next two theorems state. The proofs of these assertions, which use the 
fundamental properties of N described in Section 1.2, are given in Appendix B. 


1.3.2 Uniqueness Theorem If S is a finite set, then the number of elements in S is a
unique number in N.


1.3.3 Theorem The set N of natural numbers is an infinite set.


The next result gives some elementary properties of finite and infinite sets.


1.3.4 Theorem (a) If A is a set with m elements and B is a set with n elements and if
A ∩ B = ∅, then A ∪ B has m + n elements.

(b) If A is a set with m ∈ N elements and C ⊆ A is a set with 1 element, then A\C is a set
with m − 1 elements.

(c) If C is an infinite set and B is a finite set, then C\B is an infinite set.


Proof. (a) Let f be a bijection of $N_m$ onto A, and let g be a bijection of $N_n$ onto B. We
define h on $N_{m+n}$ by h(i) := f(i) for i = 1, ..., m and h(i) := g(i − m) for
i = m + 1, ..., m + n. We leave it as an exercise to show that h is a bijection from
$N_{m+n}$ onto A ∪ B.

The proofs of parts (b) and (c) are left to the reader, see Exercise 2. Q.E.D.


It may seem “obvious” that a subset of a finite set is also finite, but the assertion must
be deduced from the definitions. This and the corresponding statement for infinite sets are 
established next. 


1.3.5 Theorem Suppose that S and T are sets and that T ⊆ S.


(a) If S is a finite set, then T is a finite set.
(b) If T is an infinite set, then S is an infinite set.


Proof. (a) If T = ∅, we already know that T is a finite set. Thus we may suppose that
T ≠ ∅. The proof is by induction on the number of elements in S.

If S has 1 element, then the only nonempty subset T of S must coincide with S, so T is a
finite set.

Suppose that every nonempty subset of a set with k elements is finite. Now let S be a
set having k + 1 elements (so there exists a bijection f of $N_{k+1}$ onto S), and let T ⊆ S. If
f(k + 1) ∉ T, we can consider T to be a subset of $S_1 := S \setminus \{f(k + 1)\}$, which has k
elements by Theorem 1.3.4(b). Hence, by the induction hypothesis, T is a finite set.

On the other hand, if f(k + 1) ∈ T, then $T_1 := T \setminus \{f(k + 1)\}$ is a subset of $S_1$. Since
$S_1$ has k elements, the induction hypothesis implies that $T_1$ is a finite set. But this implies
that $T = T_1 \cup \{f(k + 1)\}$ is also a finite set.

(b) This assertion is the contrapositive of the assertion in (a). (See Appendix A for a 
discussion of the contrapositive.) Q.E.D. 


Countable Sets 


We now introduce an important type of infinite set. 


1.3.6 Definition (a) A set S is said to be denumerable (or countably infinite) if there
exists a bijection of N onto S.

(b) A set S is said to be countable if it is either finite or denumerable.

(c) A set S is said to be uncountable if it is not countable.


From the properties of bijections, it is clear that S is denumerable if and only if there
exists a bijection of S onto N. Also a set $S_1$ is denumerable if and only if there exists a
bijection from $S_1$ onto a set $S_2$ that is denumerable. Further, a set $T_1$ is countable if and only
if there exists a bijection from $T_1$ onto a set $T_2$ that is countable. Finally, an infinite
countable set is denumerable.


1.3.7 Examples (a) The set E := {2n : n ∈ N} of even natural numbers is denumerable,
since the mapping f : N → E defined by f(n) := 2n for n ∈ N is a bijection of N onto E.
Similarly, the set O := {2n − 1 : n ∈ N} of odd natural numbers is denumerable.

(b) The set Z of all integers is denumerable.

To construct a bijection of N onto Z, we map 1 onto 0, we map the set of even natural
numbers onto the set N of positive integers, and we map the set of odd natural numbers onto
the negative integers. This mapping can be displayed by the enumeration:

Z = {0, 1, −1, 2, −2, 3, −3, ...}.

(c) The union of two disjoint denumerable sets is denumerable.

Indeed, if $A = \{a_1, a_2, a_3, \ldots\}$ and $B = \{b_1, b_2, b_3, \ldots\}$, we can enumerate the
elements of A ∪ B as:

$$a_1, b_1, a_2, b_2, a_3, b_3, \ldots.$$
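
The alternating enumeration in Example (c) can be sketched with lazy generators, which stand in for the infinite lists $a_1, a_2, \ldots$ and $b_1, b_2, \ldots$. The choice of the even and odd natural numbers as the two disjoint sets is only an illustration, and the function name is an assumption.

```python
from itertools import count, islice
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def interleave(a: Iterable[T], b: Iterable[T]) -> Iterator[T]:
    """Yield a1, b1, a2, b2, ... from two (possibly infinite) enumerations."""
    for x, y in zip(a, b):
        yield x
        yield y


# Illustration: A = even natural numbers, B = odd natural numbers.
evens = (2 * n for n in count(1))
odds = (2 * n - 1 for n in count(1))
first_terms = list(islice(interleave(evens, odds), 8))
```

Because the two sets are disjoint, no element is listed twice, so the enumeration corresponds to a bijection of N onto A ∪ B.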


1.3.8 Theorem The set N × N is denumerable.


Informal Proof. Recall that N × N consists of all ordered pairs (m, n), where m, n ∈ N.
We can enumerate these pairs as:

(1, 1), (1, 2), (2, 1), (1, 3), (2, 2), (3, 1), (1, 4), ...,

according to increasing sum m + n, and increasing m. (See Figure 1.3.1.) Q.E.D.


The enumeration just described is an instance of a “diagonal procedure,” since we 
move along diagonals that each contain finitely many terms as illustrated in Figure 1.3.1. 

The bijection indicated by the diagram can be derived as follows. We first notice that
the first diagonal has one point, the second diagonal has two points, and so on, with k points
in the kth diagonal. Applying the formula in Example 1.2.4(a), we see that the total number
of points in diagonals 1 through k is given by

$$\psi(k) = 1 + 2 + \cdots + k = \tfrac{1}{2} k(k + 1).$$


Figure 1.3.1 The set N × N


The point (m, n) lies in the kth diagonal when k = m + n − 1, and it is the mth point in
that diagonal as we move downward from left to right. (For example, the point (3, 2) lies
in the 4th diagonal since 3 + 2 − 1 = 4, and it is the 3rd point in that diagonal.) Therefore,
in the counting scheme displayed by Figure 1.3.1, we count the point (m, n) by first
counting the points in the first k − 1 = m + n − 2 diagonals and then adding m. Therefore,
the counting function h : N × N → N is given by

$$
\begin{aligned}
h(m, n) &:= \psi(m + n - 2) + m \\
&= \tfrac{1}{2} (m + n - 2)(m + n - 1) + m.
\end{aligned}
$$

For example, the point (3, 2) is counted as number $h(3, 2) = \tfrac{1}{2} \cdot 3 \cdot 4 + 3 = 9$, as
shown by Figure 1.3.1. Similarly, the point (17, 25) is counted as number
$h(17, 25) = \psi(40) + 17 = 837$.

This geometric argument leading to the counting formula has been suggestive and
convincing, but it remains to be proved that h is, in fact, a bijection of N × N onto N.
A detailed proof is given in Appendix B.
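
The counting function can be written out directly and compared with the two worked values above. The sketch also checks, for the first 50 diagonals only, that the values taken are exactly 1, 2, ..., ψ(50) with no repeats; this finite check is an illustration and is not the proof deferred to Appendix B.

```python
def psi(k: int) -> int:
    """Number of points in diagonals 1 through k: k(k + 1)/2."""
    return k * (k + 1) // 2


def h(m: int, n: int) -> int:
    """Position of the pair (m, n) in the diagonal enumeration of N x N."""
    return psi(m + n - 2) + m


# The two worked examples from the text.
assert h(3, 2) == 9
assert h(17, 25) == 837

# On the first K diagonals, h takes each value 1, ..., psi(K) exactly once.
K = 50
values = [h(m, k - m + 1) for k in range(1, K + 1) for m in range(1, k + 1)]
assert sorted(values) == list(range(1, psi(K) + 1))
```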

The construction of an explicit bijection between sets is often complicated. The next 
two results are useful in establishing the countability of sets, since they do not involve 
showing that certain mappings are bijections. The first result may seem intuitively clear, 
but its proof is rather technical; it will be given in Appendix B. 


1.3.9 Theorem Suppose that S and T are sets and that T ⊆ S.


(a) If S is a countable set, then T is a countable set.
(b) If T is an uncountable set, then S is an uncountable set.


1.3.10 Theorem The following statements are equivalent: 


(a) S is a countable set. 
(b) There exists a surjection of N onto S. 
(c) There exists an injection of S into N. 


Proof. (a) ⇒ (b) If S is finite, there exists a bijection h of some set $N_n$ onto S and we
define H on N by

$$
H(k) := \begin{cases} h(k) & \text{for } k = 1, \ldots, n, \\ h(n) & \text{for } k > n. \end{cases}
$$

Then H is a surjection of N onto S.

If S is denumerable, there exists a bijection H of N onto S, which is also a surjection of
N onto S.

(b) ⇒ (c) If H is a surjection of N onto S, we define $H_1$ : S → N by letting $H_1(s)$ be the
least element in the set $H^{-1}(s) := \{n \in N : H(n) = s\}$. To see that $H_1$ is an injection of S
into N, note that if s, t ∈ S and $n_{st} := H_1(s) = H_1(t)$, then $s = H(n_{st}) = t$.

(c) ⇒ (a) If $H_1$ is an injection of S into N, then it is a bijection of S onto $H_1(S)$ ⊆ N. By
Theorem 1.3.9(a), $H_1(S)$ is countable, whence the set S is countable. Q.E.D.
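
The step (b) ⇒ (c) is constructive: the injection sends each s to its least preimage. The sketch below illustrates this for a finite set S; the search cap `search_limit` and the sample surjection n ↦ n mod 3 are assumptions introduced only so that the example terminates and can be run.

```python
from typing import Callable, Dict, Hashable, Iterable


def least_preimage_injection(
    H: Callable[[int], Hashable],
    S: Iterable[Hashable],
    search_limit: int = 10_000,
) -> Dict[Hashable, int]:
    """Return H1 with H1[s] = least n >= 1 such that H(n) == s.

    search_limit is an artificial cap (an assumption of this sketch) so the
    search stops even if H is not actually onto S.
    """
    H1: Dict[Hashable, int] = {}
    for s in S:
        for n in range(1, search_limit + 1):
            if H(n) == s:
                H1[s] = n
                break
        else:
            raise ValueError(f"no preimage of {s!r} found up to {search_limit}")
    return H1


# Hypothetical example: H(n) = n mod 3 is a surjection of N onto {0, 1, 2}.
S = [0, 1, 2]
H1 = least_preimage_injection(lambda n: n % 3, S)
```

Distinct elements of S receive distinct values because H maps each chosen n back to the element it was chosen for, which is the argument given in the proof.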


1.3.11 Theorem The set Q of all rational numbers is denumerable. 


Proof. The idea of the proof is to observe that the set $Q^+$ of positive rational numbers is
contained in the enumeration:

$$\tfrac{1}{1}, \tfrac{1}{2}, \tfrac{2}{1}, \tfrac{1}{3}, \tfrac{2}{2}, \tfrac{3}{1}, \tfrac{1}{4}, \ldots,$$

which is another “diagonal mapping” (see Figure 1.3.2). However, this mapping is not an
injection, since the different fractions $\tfrac{1}{2}$ and $\tfrac{2}{4}$ represent the same rational number.


Figure 1.3.2 The set $Q^+$


To proceed more formally, note that since N × N is countable (by Theorem 1.3.8), it
follows from Theorem 1.3.10(b) that there exists a surjection f of N onto N × N. If g :
N × N → $Q^+$ is the mapping that sends the ordered pair (m, n) into the rational number
having a representation m/n, then g is a surjection onto $Q^+$. Therefore, the composition g ∘ f
is a surjection of N onto $Q^+$, and Theorem 1.3.10 implies that $Q^+$ is a countable set.

Similarly, the set $Q^-$ of all negative rational numbers is countable. It follows as in
Example 1.3.7(b) that the set $Q = Q^- \cup \{0\} \cup Q^+$ is countable. Since Q contains N, it
must be a denumerable set. Q.E.D.
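
The diagonal enumeration of the positive rationals can be sketched as a generator. Note that the proof above does not remove repeated values; it only needs a surjection. The version below additionally skips fractions already seen, which is a variation introduced here to produce a listing without repeats, and the function names are illustrative.

```python
from fractions import Fraction
from itertools import islice
from typing import Iterator, Tuple


def diagonal_pairs() -> Iterator[Tuple[int, int]]:
    """Enumerate N x N by diagonals: (1,1), (1,2), (2,1), (1,3), (2,2), ..."""
    k = 1
    while True:
        for m in range(1, k + 1):
            yield (m, k - m + 1)
        k += 1


def positive_rationals() -> Iterator[Fraction]:
    """List each positive rational once, in order of first appearance as m/n."""
    seen = set()
    for m, n in diagonal_pairs():
        q = Fraction(m, n)
        if q not in seen:  # e.g. 2/2 is skipped because 1/1 was already listed
            seen.add(q)
            yield q


first_seven = list(islice(positive_rationals(), 7))
```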


The next result is concerned with unions of sets. In view of Theorem 1.3.10, we need 
not be worried about possible overlapping of the sets. Also, we do not have to construct a 
bijection. 


1.3.12 Theorem If $A_m$ is a countable set for each m ∈ N, then the union $A := \bigcup_{m=1}^{\infty} A_m$
is countable.


Proof. For each m ∈ N, let $\varphi_m$ be a surjection of N onto $A_m$. We define β : N × N → A by

$$\beta(m, n) := \varphi_m(n).$$

We claim that β is a surjection. Indeed, if a ∈ A, then there exists a least m ∈ N such that
a ∈ $A_m$, whence there exists a least n ∈ N such that $a = \varphi_m(n)$. Therefore, a = β(m, n).

Since N × N is countable, it follows from Theorem 1.3.10 that there exists a surjection
f : N → N × N whence β ∘ f is a surjection of N onto A. Now apply Theorem 1.3.10 again
to conclude that A is countable. Q.E.D.


Remark A less formal (but more intuitive) way to see the truth of Theorem 1.3.12 is to
enumerate the elements of $A_m$, m ∈ N, as:

$$
\begin{aligned}
A_1 &= \{a_{11}, a_{12}, a_{13}, \ldots\}, \\
A_2 &= \{a_{21}, a_{22}, a_{23}, \ldots\}, \\
A_3 &= \{a_{31}, a_{32}, a_{33}, \ldots\},
\end{aligned}
$$

We then enumerate this array using the “diagonal procedure”:

$$a_{11}, a_{12}, a_{21}, a_{13}, a_{22}, a_{31}, a_{14}, \ldots,$$

as was displayed in Figure 1.3.1.


Georg Cantor 

Georg Cantor (1845-1918) was born in St. Petersburg, Russia. His father, 
a Danish businessman working in Russia, moved the family to Germany 
several years later. Cantor studied briefly at Zurich, then went to the 
University of Berlin, the best in mathematics at the time. He received his 
doctorate in 1869, and accepted a position at the University of Halle, 
where he worked alone on his research, but would occasionally travel the 
seventy miles to Berlin to visit colleagues. 

Cantor is known as the founder of modern set theory and he was the first to study the 
concept of infinite set in rigorous detail. In 1874 he proved that Q is countable and, in 
contrast, that R is uncountable (see Section 2.5), exhibiting two kinds of infinity. In a series of 
papers he developed a general theory of infinite sets, including some surprising results. In 
1877 he proved that the two-dimensional unit square in the plane could be put into one-one 
correspondence with the unit interval on the line, a result he sent in a letter to his colleague 
Richard Dedekind in Berlin, writing “I see it, but I do not believe it.” Cantor’s Theorem on 
sets of subsets shows there are many different orders of infinity and this led him to create a 
theory of “transfinite” numbers that he published in 1895 and 1897. His work generated
considerable controversy among mathematicians of that era, but in 1904, London’s Royal 
Society awarded Cantor the Sylvester Medal, its highest honor. 

Beginning in 1884, he suffered from episodes of depression that increased in severity as the 
years passed. He was hospitalized several times for nervous breakdowns in the Halle Nervenklinik 
and spent the last seven months of his life there. 


We close this section with one of Cantor’s more remarkable theorems. 


1.3.13 Cantor’s Theorem If A is any set, then there is no surjection of A onto the set
P(A) of all subsets of A.


Proof. Suppose that φ : A → P(A) is a surjection. Since φ(a) is a subset of A, either a
belongs to φ(a) or it does not belong to this set. We let

D := {a ∈ A : a ∉ φ(a)}.

Since D is a subset of A, if φ is a surjection, then $D = \varphi(a_0)$ for some $a_0$ ∈ A.

We must have either $a_0$ ∈ D or $a_0$ ∉ D. If $a_0$ ∈ D, then since $D = \varphi(a_0)$, we must have
$a_0 \in \varphi(a_0)$, contrary to the definition of D. Similarly, if $a_0$ ∉ D, then $a_0 \notin \varphi(a_0)$ so that
$a_0$ ∈ D, which is also a contradiction.

Therefore, φ cannot be a surjection. Q.E.D.
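
For a small finite set the diagonal argument can be checked exhaustively: for every map φ from A into P(A), the set D built in the proof is missing from the image of φ. The sketch below does this for a three-element set (512 maps); the function names are illustrative, and the check says nothing new about infinite sets, where the proof is needed.

```python
from itertools import combinations, product
from typing import Dict, FrozenSet, List, Set


def power_set(A: Set[int]) -> List[FrozenSet[int]]:
    """All subsets of A, as frozensets."""
    items = sorted(A)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]


def diagonal_set(A: Set[int], phi: Dict[int, FrozenSet[int]]) -> FrozenSet[int]:
    """The set D := {a in A : a not in phi(a)} from the proof."""
    return frozenset(a for a in A if a not in phi[a])


def no_map_hits_its_diagonal(A: Set[int]) -> bool:
    """Check every map phi : A -> P(A) and confirm D is never a value of phi."""
    items = sorted(A)
    subsets = power_set(A)
    for images in product(subsets, repeat=len(items)):
        phi = dict(zip(items, images))
        if diagonal_set(A, phi) in set(images):
            return False
    return True


assert no_map_hits_its_diagonal({1, 2, 3})
```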


Cantor’s Theorem implies that there is an unending progression of larger and larger 
sets. In particular, it implies that the collection P(N) of all subsets of the natural numbers 
N is uncountable. 
Exercises for Section 1.3


1. Prove that a nonempty set $T_1$ is finite if and only if there is a bijection from $T_1$ onto a finite set $T_2$.

2. Prove parts (b) and (c) of Theorem 1.3.4.

3. Let S := {1, 2} and T := {a, b, c}.
(a) Determine the number of different injections from S into T.
(b) Determine the number of different surjections from T onto S.

4. Exhibit a bijection between N and the set of all odd integers greater than 13.

5. Give an explicit definition of the bijection f from N onto Z described in Example 1.3.7(b).

6. Exhibit a bijection between N and a proper subset of itself.

7. Prove that a set $T_1$ is denumerable if and only if there is a bijection from $T_1$ onto a denumerable
set $T_2$.

8. Give an example of a countable collection of finite sets whose union is not finite.

9. Prove in detail that if S and T are denumerable, then S ∪ T is denumerable.

10. (a) If (m, n) is the 6th point down the 9th diagonal of the array in Figure 1.3.1, calculate its
number according to the counting method given for Theorem 1.3.8.
(b) Given that h(m, 3) = 19, find m.

11. Determine the number of elements in P(S), the collection of all subsets of S, for each of the
following sets:
(a) S := {1, 2},
(b) S := {1, 2, 3},
(c) S := {1, 2, 3, 4}.
Be sure to include the empty set and the set S itself in P(S).

12. Use Mathematical Induction to prove that if the set S has n elements, then P(S) has $2^n$ elements.

13. Prove that the collection F(N) of all finite subsets of N is countable.


CHAPTER 2 


THE REAL NUMBERS 


In this chapter we will discuss the essential properties of the real number system R. 
Although it is possible to give a formal construction of this system on the basis of a more 
primitive set (such as the set N of natural numbers or the set Q of rational numbers), we 
have chosen not to do so. Instead, we exhibit a list of fundamental properties associated 
with the real numbers and show how further properties can be deduced from them. This 
kind of activity is much more useful in learning the tools of analysis than examining the 
logical difficulties of constructing a model for R. 

The real number system can be described as a “complete ordered field,” and we will 
discuss that description in considerable detail. In Section 2.1, we first introduce the
“algebraic” properties (often called the “field” properties in abstract algebra) that are
based on the two operations of addition and multiplication. We continue the section with 
the introduction of the “order” properties of R and we derive some consequences of these 
properties and illustrate their use in working with inequalities. The notion of absolute 
value, which is based on the order properties, is discussed in Section 2.2. 

In Section 2.3, we make the final step by adding the crucial “completeness” property 
to the algebraic and order properties of R. It is this property, which was not fully 
understood until the late nineteenth century, that underlies the theory of limits and 
continuity and essentially all that follows in this book. The rigorous development of 
real analysis would not be possible without this essential property. 

In Section 2.4, we apply the Completeness Property to derive several fundamental 
results concerning R, including the Archimedean Property, the existence of square roots, 
and the density of rational numbers in R. We establish, in Section 2.5, the Nested Interval 
Property and use it to prove the uncountability of R. We also discuss its relation to binary 
and decimal representations of real numbers. 

Part of the purpose of Sections 2.1 and 2.2 is to provide examples of proofs of 
elementary theorems from explicitly stated assumptions. Students can thus gain experience 
in writing formal proofs before encountering the more subtle and complicated arguments 
related to the Completeness Property and its consequences. However, students who have 
previously studied the axiomatic method and the technique of proofs (perhaps in a course 
on abstract algebra) can move to Section 2.3 after a cursory look at the earlier sections. A 
brief discussion of logic and types of proofs can be found in Appendix A at the back of the 
book. Terms such as “contrapositive” and “converse” are explained there and several
proofs are examined in detail. 


Section 2.1 The Algebraic and Order Properties of R 


We begin with a brief discussion of the “algebraic structure” of the real number system. 
We will give a short list of basic properties of addition and multiplication from which all 
other algebraic properties can be derived as theorems. In the terminology of abstract 
algebra, the system of real numbers is a “field” with respect to addition and multiplication.
The basic properties listed in 2.1.1 are known as the field axioms. A binary operation
associates with each pair (a, b) a unique element B(a, b), but we will use the conventional
notations of a + b and a · b when discussing the properties of addition and multiplication.


2.1.1 Algebraic Properties of R On the set R of real numbers there are two binary
operations, denoted by + and · and called addition and multiplication, respectively. These
operations satisfy the following properties:


(A1) a + b = b + a for all a, b in R (commutative property of addition);

(A2) (a + b) + c = a + (b + c) for all a, b, c in R (associative property of addition);

(A3) there exists an element 0 in R such that 0 + a = a and a + 0 = a for all a in R
(existence of a zero element);

(A4) for each a in R there exists an element −a in R such that a + (−a) = 0 and (−a) +
a = 0 (existence of negative elements);

(M1) a · b = b · a for all a, b in R (commutative property of multiplication);

(M2) (a · b) · c = a · (b · c) for all a, b, c in R (associative property of multiplication);

(M3) there exists an element 1 in R distinct from 0 such that 1 · a = a and a · 1 = a for all
a in R (existence of a unit element);

(M4) for each a ≠ 0 in R there exists an element 1/a in R such that a · (1/a) = 1 and
(1/a) · a = 1 (existence of reciprocals);

(D) a · (b + c) = (a · b) + (a · c) and (b + c) · a = (b · a) + (c · a) for all a, b, c in R
(distributive property of multiplication over addition).


These properties should be familiar to the reader. The first four are concerned with 
addition, the next four with multiplication, and the last one connects the two operations. 
The point of the list is that all the familiar techniques of algebra can be derived from these 
nine properties, in much the same spirit that the theorems of Euclidean geometry can be 
deduced from the five basic axioms stated by Euclid in his Elements. Since this task more 
properly belongs to a course in abstract algebra, we will not carry it out here. However, to 
exhibit the spirit of the endeavor, we will sample a few results and their proofs. 

We first establish the basic fact that the elements 0 and 1, whose existence was
asserted in (A3) and (M3), are in fact unique. We also show that multiplication by 0 always
results in 0.


2.1.2 Theorem (a) If z and a are elements in R with z + a = a, then z = 0.
(b) If u and b ≠ 0 are elements in R with u · b = b, then u = 1.
(c) If a ∈ R, then a · 0 = 0.

Proof. (a) Using (A3), (A4), (A2), the hypothesis z + a = a, and (A4), we get
z = z + 0 = z + (a + (−a)) = (z + a) + (−a) = a + (−a) = 0.
(b) Using (M3), (M4), (M2), the assumed equality u · b = b, and (M4) again, we get
u = u · 1 = u · (b · (1/b)) = (u · b) · (1/b) = b · (1/b) = 1.
(c) We have (why?)
a + a · 0 = a · 1 + a · 0 = a · (1 + 0) = a · 1 = a.


Therefore, we conclude from (a) that a · 0 = 0. Q.E.D.

We next establish two important properties of multiplication: the uniqueness of reciprocals
and the fact that a product of two numbers is zero only when one of the factors is zero.


2.1.3 Theorem (a) If a ≠ 0 and b in R are such that a · b = 1, then b = 1/a.
(b) If a · b = 0, then either a = 0 or b = 0.


Proof. (a) Using (M3), (M4), (M2), the hypothesis a · b = 1, and (M3), we have


b = 1 · b = ((1/a) · a) · b = (1/a) · (a · b) = (1/a) · 1 = 1/a.


(b) It suffices to assume a ≠ 0 and prove that b = 0. (Why?) We multiply a · b by 1/a and
apply (M2), (M4), and (M3) to get


(1/a) · (a · b) = ((1/a) · a) · b = 1 · b = b.

Since a · b = 0, by 2.1.2(c) this also equals


(1/a) · (a · b) = (1/a) · 0 = 0.

Thus we have b = 0. Q.E.D.


These theorems represent a small sample of the algebraic properties of the real number 
system. Some additional consequences of the field properties are given in the exercises. 

The operation of subtraction is defined by a − b := a + (−b) for a, b in R. Similarly,
division is defined for a, b in R with b ≠ 0 by a/b := a · (1/b). In the following, we will
use this customary notation for subtraction and division, and we will use all the familiar
properties of these operations. We will ordinarily drop the use of the dot to indicate
multiplication and write ab for a · b. Similarly, we will use the usual notation for exponents
and write a² for aa, a³ for (a²)a; and, in general, we define a^(n+1) := (a^n)a for n ∈ N. We
agree to adopt the convention that a¹ = a. Further, if a ≠ 0, we write a⁰ = 1 and a⁻¹ for
1/a, and if n ∈ N, we will write a^(−n) for (1/a)^n, when it is convenient to do so. In general,
we will freely apply all the usual techniques of algebra without further elaboration.
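
The exponent conventions just described amount to a recursive definition, which can be transcribed almost word for word. The following Python sketch is only an illustration under our own naming (the function `power` is hypothetical, not notation from the text) and works with exact rationals.

```python
from fractions import Fraction


def power(a, n):
    """Return a^n for an integer n, following the conventions in the text."""
    a = Fraction(a)
    if n >= 1:
        # a^1 = a, and a^(n+1) := (a^n) a
        return a if n == 1 else power(a, n - 1) * a
    if a == 0:
        raise ValueError("a^0 and a^(-n) are only defined here for a != 0")
    if n == 0:
        return Fraction(1)        # a^0 = 1
    return power(1 / a, -n)       # a^(-n) := (1/a)^n


print(power(2, 5))
print(power(Fraction(2, 3), -2))
```

The recursion mirrors the definition rather than aiming for efficiency; in practice Python's built-in `**` operator on `Fraction` values gives the same results.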


Rational and Irrational Numbers 


We regard the set N of natural numbers as a subset of R, by identifying the natural number
n ∈ N with the n-fold sum of the unit element 1 ∈ R. Similarly, we identify 0 ∈ Z with the
zero element 0 ∈ R, and we identify the n-fold sum of −1 with the integer −n. Thus, we
consider N and Z to be subsets of R.

Elements of R that can be written in the form b/a where a, b ∈ Z and a ≠ 0 are called
rational numbers. The set of all rational numbers in R will be denoted by the standard
notation Q. The sum and product of two rational numbers are again rational numbers (prove
this), and moreover, the field properties listed at the beginning of this section can be shown
to hold for Q.

The fact that there are elements in R that are not in Q is not immediately apparent. In 
the sixth century B.C. the ancient Greek society of Pythagoreans discovered that the
diagonal of a square with unit sides could not be expressed as a ratio of integers. In view of 
the Pythagorean Theorem for right triangles, this implies that the square of no rational 
number can equal 2. This discovery had a profound impact on the development of Greek 
mathematics. One consequence is that elements of R that are not in Q became known as 
irrational numbers, meaning that they are not ratios of integers. Although the word 
“irrational” in modern English usage has a quite different meaning, we shall adopt the 
standard mathematical usage of this term.

We will now prove that there does not exist a rational number whose square is 2. In the
proof we use the notions of even and odd numbers. Recall that a natural number is even if it
has the form 2n for some n ∈ N, and it is odd if it has the form 2n − 1 for some n ∈ N.
Every natural number is either even or odd, and no natural number is both even and odd.


2.1.4 Theorem There does not exist a rational number r such that r² = 2.


Proof. Suppose, on the contrary, that p and q are integers such that (p/q)² = 2. We may
assume that p and q are positive and have no common integer factors other than 1. (Why?)
Since p² = 2q², we see that p² is even. This implies that p is also even (because if p =
2n − 1 is odd, then its square p² = 2(2n² − 2n + 1) − 1 is also odd). Therefore, since p and
q do not have 2 as a common factor, q must be an odd natural number.

Since p is even, p = 2m for some m ∈ N, and hence 4m² = 2q², so that 2m² = q².
Therefore, q² is even, and it follows that q is an even natural number.

Since the hypothesis that (p/q)² = 2 leads to the contradictory conclusion that q is
both even and odd, it must be false. Q.E.D.
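
A finite search cannot replace the proof, but it can make the statement concrete. The Python sketch below looks for positive integers p, q with p² = kq² for denominators up to a bound; the function name `rational_square_roots_of` and the bound 10 000 are our own illustrative choices. For k = 2 the search comes back empty, as the theorem predicts, while for k = 4 it finds the expected solutions.

```python
from math import isqrt


def rational_square_roots_of(k, max_q):
    """List all pairs (p, q) with 1 <= q <= max_q, p >= 1, and p*p == k*q*q."""
    hits = []
    for q in range(1, max_q + 1):
        target = k * q * q
        p = isqrt(target)          # integer part of the square root of k*q^2
        if p * p == target:        # exact integer test, no rounding involved
            hits.append((p, q))
    return hits


print(rational_square_roots_of(2, 10_000))   # no rational p/q with (p/q)^2 = 2 in this range
print(rational_square_roots_of(4, 3))        # (2/1)^2 = (4/2)^2 = (6/3)^2 = 4
```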


The Order Properties of R 


The “order properties” of R refer to the notions of positivity and inequalities between real 
numbers. As with the algebraic structure of the system of real numbers, we proceed by 
isolating three basic properties from which all other order properties and calculations with 
inequalities can be deduced. The simplest way to do this is to identify a special subset of R 
by using the notion of “positivity.” 


2.1.5 The Order Properties of R There is a nonempty subset P of R, called the set of 
positive real numbers, that satisfies the following properties: 


(i) If a, b belong to P, then a + b belongs to P.
(ii) If a, b belong to P, then ab belongs to P.
(iii) If a belongs to R, then exactly one of the following holds:


a ∈ P, a = 0, −a ∈ P.


The first two conditions ensure the compatibility of order with the operations of 
addition and multiplication, respectively. Condition 2.1.5(iii) is usually called the 
Trichotomy Property, since it divides R into three distinct types of elements. It states 
that the set {−a : a ∈ P} of negative real numbers has no elements in common with the set
P of positive real numbers, and, moreover, the set R is the union of three disjoint sets.

If a ∈ P, we write a > 0 and say that a is a positive (or a strictly positive) real number.
If a ∈ P ∪ {0}, we write a ≥ 0 and say that a is a nonnegative real number. Similarly, if
−a ∈ P, we write a < 0 and say that a is a negative (or a strictly negative) real number.
If −a ∈ P ∪ {0}, we write a ≤ 0 and say that a is a nonpositive real number.

The notion of inequality between two real numbers will now be defined in terms of the 
set P of positive elements. 


2.1.6 Definition Let a, b be elements of R. 


(a) If a − b ∈ P, then we write a > b or b < a.
(b) If a − b ∈ P ∪ {0}, then we write a ≥ b or b ≤ a.

The Trichotomy Property 2.1.5(iii) implies that for a, b ∈ R exactly one of the
following will hold:


a > b, a = b, a < b.

Therefore, if both a ≤ b and b ≤ a, then a = b.

For notational convenience, we will write

a < b < c


to mean that both a < b and b < c are satisfied. The other “double” inequalities
a ≤ b < c, a ≤ b ≤ c, and a < b ≤ c are defined in a similar manner.


To illustrate how the basic Order Properties are used to derive the “rules of inequalities,” we 
will now establish several results that the reader has used in earlier mathematics courses. 


2.1.7 Theorem Let a, b, c be any elements of R.


(a) If a > b and b > c, then a > c.

(b) If a > b, then a + c > b + c.

(c) If a > b and c > 0, then ca > cb.
If a > b and c < 0, then ca < cb.


Proof. (a) If a − b ∈ P and b − c ∈ P, then 2.1.5(i) implies that (a − b) + (b − c) =
a − c belongs to P. Hence a > c.
(b) If a − b ∈ P, then (a + c) − (b + c) = a − b is in P. Thus a + c > b + c.
(c) If a − b ∈ P and c ∈ P, then ca − cb = c(a − b) is in P by 2.1.5(ii). Thus ca > cb
when c > 0.

On the other hand, if c < 0, then −c ∈ P, so that cb − ca = (−c)(a − b) is in P. Thus
cb > ca when c < 0. Q.E.D.
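
To see how the order relation is built from the set P alone, it may help to write Definition 2.1.6(a) as code and test the rules of Theorem 2.1.7 against it. In the sketch below, membership in P is delegated to Python's built-in comparison on exact rationals (an assumption of the illustration, since the text takes P as given), and the names `in_P`, `greater`, and `trichotomy_count` are hypothetical.

```python
from fractions import Fraction
from itertools import product


def in_P(a):
    """Membership in the set P of positive numbers (delegated to Python's order)."""
    return a > 0


def greater(a, b):
    """a > b in the sense of Definition 2.1.6(a): a - b belongs to P."""
    return in_P(a - b)


def trichotomy_count(a):
    """How many of the statements 'a in P', 'a = 0', '-a in P' hold (should be 1)."""
    return [in_P(a), a == 0, in_P(-a)].count(True)


SAMPLES = [Fraction(n, d) for n in range(-4, 5) for d in (1, 3)]

rules_ok = all(
    (not (greater(a, b) and greater(b, c)) or greater(a, c))             # 2.1.7(a)
    and (not greater(a, b) or greater(a + c, b + c))                     # 2.1.7(b)
    and (not (greater(a, b) and in_P(c)) or greater(c * a, c * b))       # 2.1.7(c), c > 0
    and (not (greater(a, b) and in_P(-c)) or greater(c * b, c * a))      # 2.1.7(c), c < 0
    for a, b, c in product(SAMPLES, repeat=3)
)
trichotomy_ok = all(trichotomy_count(a) == 1 for a in SAMPLES)
print(rules_ok, trichotomy_ok)
```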


It is natural to expect that the natural numbers are positive real numbers. This property 
is derived from the basic properties of order. The key observation is that the square of any 
nonzero real number is positive. 


2.1.8 Theorem

(a) If a ∈ R and a ≠ 0, then a² > 0.
(b) 1 > 0.

(c) If n ∈ N, then n > 0.


Proof. (a) By the Trichotomy Property, if a ≠ 0, then either a ∈ P or −a ∈ P. If a ∈ P,
then by 2.1.5(ii), we have a² = a · a ∈ P. Also, if −a ∈ P, then a² = (−a)(−a) ∈ P. We
conclude that if a ≠ 0, then a² > 0.

(b) Since 1 = 1², it follows from (a) that 1 > 0.

(c) We use Mathematical Induction. The assertion for n = 1 is true by (b). If we suppose the
assertion is true for the natural number k, then k ∈ P, and since 1 ∈ P, we have k + 1 ∈ P by
2.1.5(i). Therefore, the assertion is true for all natural numbers. Q.E.D.


It is worth noting that no smallest positive real number can exist. This follows by
observing that if a > 0, then since ½ > 0 (why?), we have that


0 < ½a < a.

Thus if it is claimed that a is the smallest positive real number, we can exhibit a smaller
positive number ½a.

This observation leads to the next result, which will be used frequently as a method of 
proof. For instance, to prove that a number a ≥ 0 is actually equal to zero, we see that it
suffices to show that a is smaller than an arbitrary positive number. 


2.1.9 Theorem If a ∈ R is such that 0 ≤ a < ε for every ε > 0, then a = 0.


Proof. Suppose to the contrary that a > 0. Then if we take ε₀ := ½a, we have 0 < ε₀ < a.
Therefore, it is false that a < ε for every ε > 0 and we conclude that a = 0. Q.E.D.


Remark It is an exercise to show that if a ∈ R is such that 0 ≤ a ≤ ε for every ε > 0,
then a = 0.


The product of two positive numbers is positive. However, the positivity of a product 
of two numbers does not imply that each factor is positive. The correct conclusion is given 
in the next theorem. It is an important tool in working with inequalities. 


2.1.10 Theorem If ab > 0, then either


(i) a > 0 and b > 0, or
(ii) a < 0 and b < 0.


Proof. First we note that ab > 0 implies that a ≠ 0 and b ≠ 0. (Why?) From the Trichotomy
Property, either a > 0 or a < 0. If a > 0, then 1/a > 0, and therefore b = (1/a)(ab) > 0.
Similarly, if a < 0, then 1/a < 0, so that b = (1/a)(ab) < 0. Q.E.D.


2.1.11 Corollary If ab < 0, then either


(i) a < 0 and b > 0, or
(ii) a > 0 and b < 0.


Inequalities 


We now show how the Order Properties presented in this section can be used to “solve” 
certain inequalities. The reader should justify each of the steps. 


2.1.12 Examples (a) Determine the set A of all real numbers x such that 2x + 3 ≤ 6.
We note that we have¹


x ∈ A ⟺ 2x + 3 ≤ 6 ⟺ 2x ≤ 3 ⟺ x ≤ 3/2.

Therefore A = {x ∈ R : x ≤ 3/2}.

¹The symbol ⟺ should be read “if and only if.”

(b) Determine the set B := {x ∈ R : x² + x > 2}.
We rewrite the inequality so that Theorem 2.1.10 can be applied. Note that

x ∈ B ⟺ x² + x − 2 > 0 ⟺ (x − 1)(x + 2) > 0.


Therefore, we either have (i) x − 1 > 0 and x + 2 > 0, or we have (ii) x − 1 < 0 and
x + 2 < 0. In case (i) we must have both x > 1 and x > −2, which is satisfied if and only
if x > 1. In case (ii) we must have both x < 1 and x < −2, which is satisfied if and only
if x < −2.

We conclude that B = {x ∈ R : x > 1} ∪ {x ∈ R : x < −2}.
(c) Determine the set

C := {x ∈ R : (2x + 1)/(x + 2) < 1}.

We note that

x ∈ C ⟺ (2x + 1)/(x + 2) − 1 < 0 ⟺ (x − 1)/(x + 2) < 0.

Therefore we have either (i) x − 1 < 0 and x + 2 > 0, or (ii) x − 1 > 0 and x + 2 < 0.
(Why?) In case (i) we must have both x < 1 and x > −2, which is satisfied if and only if
−2 < x < 1. In case (ii), we must have both x > 1 and x < −2, which is never satisfied.

We conclude that C = {x ∈ R : −2 < x < 1}.
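
The solution sets found in these examples can be sanity-checked by comparing, point by point on a grid, the original condition with the solved form. The Python sketch below does this with exact rationals; the grid and the helper `mismatches` are our own choices. In (a) the inequality is read as 2x + 3 ≤ 6, and for (c) only the sign condition used in the case analysis (x − 1 and x + 2 have opposite signs) is tested. Agreement on a grid is a plausibility check, not a proof.

```python
from fractions import Fraction

# Hypothetical test grid: -5, -4.75, ..., 4.75, 5 (exact rationals).
GRID = [Fraction(n, 4) for n in range(-20, 21)]


def mismatches(condition, solved):
    """Grid points where the original condition and the solved form disagree."""
    return [x for x in GRID if condition(x) != solved(x)]


# (a) 2x + 3 <= 6  versus  x <= 3/2
bad_a = mismatches(lambda x: 2 * x + 3 <= 6, lambda x: x <= Fraction(3, 2))
# (b) x^2 + x > 2  versus  x > 1 or x < -2
bad_b = mismatches(lambda x: x * x + x > 2, lambda x: x > 1 or x < -2)
# (c) x - 1 and x + 2 have opposite signs  versus  -2 < x < 1
bad_c = mismatches(lambda x: (x - 1) * (x + 2) < 0, lambda x: -2 < x < 1)

print(bad_a, bad_b, bad_c)
```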


The following examples illustrate the use of the Order Properties of R in establishing 
certain inequalities. The reader should verify the steps in the arguments by identifying the 
properties that are employed. 

It should be noted that the existence of square roots of positive numbers has not yet 
been established; however, we assume the existence of these roots for the purpose of these 
examples. (The existence of square roots will be discussed in Section 2.4.) 


2.1.13 Examples (a) Let a ≥ 0 and b ≥ 0. Then

(1) a < b ⟺ a² < b² ⟺ √a < √b.


We consider the case where a > 0 and b > 0, leaving the case a = 0 to the reader. It follows
from 2.1.5(i) that a + b > 0. Since b² − a² = (b − a)(b + a), it follows from 2.1.7(c) that
b − a > 0 implies that b² − a² > 0. Also, it follows from 2.1.10 that b² − a² > 0 implies
that b − a > 0.

If a > 0 and b > 0, then √a > 0 and √b > 0. Since a = (√a)² and b = (√b)², the
second implication is a consequence of the first one when a and b are replaced by √a and
√b, respectively.

We also leave it to the reader to show that if a ≥ 0 and b ≥ 0, then


(1′) a ≤ b ⟺ a² ≤ b² ⟺ √a ≤ √b.


(b) If a and b are positive real numbers, then their arithmetic mean is ½(a + b) and their
geometric mean is √(ab). The Arithmetic-Geometric Mean Inequality for a, b is


(2) √(ab) ≤ ½(a + b)

with equality occurring if and only if a = b.

To prove this, note that if a > 0, b > 0, and a ≠ b, then √a > 0, √b > 0, and
√a ≠ √b. (Why?) Therefore it follows from 2.1.8(a) that (√a − √b)² > 0. Expanding
this square, we obtain


a − 2√(ab) + b > 0,


whence it follows that


√(ab) < ½(a + b).

Therefore (2) holds (with strict inequality) when a ≠ b. Moreover, if a = b (> 0), then both
sides of (2) equal a, so (2) becomes an equality. This proves that (2) holds for a > 0, b > 0.


On the other hand, suppose that a > 0, b > 0 and that √(ab) = ½(a + b). Then,
squaring both sides and multiplying by 4, we obtain

4ab = (a + b)² = a² + 2ab + b²,

whence it follows that

0 = a² − 2ab + b² = (a − b)².

But this equality implies that a = b. (Why?) Thus, equality in (2) implies that a = b.
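
Since both sides of (2) are nonnegative, Example 2.1.13(a) shows that (2) is equivalent to the square-root-free statement 4ab ≤ (a + b)², and this form can be checked exactly. The Python sketch below does so on a sample of positive rationals; the name `am_gm_gap` and the sample set are hypothetical choices made for the illustration.

```python
from fractions import Fraction


def am_gm_gap(a, b):
    """Return (a + b)^2 - 4ab, which the proof shows is equal to (a - b)^2."""
    return (a + b) ** 2 - 4 * a * b


# Hypothetical sample of positive rationals.
SAMPLES = [Fraction(n, d) for n in range(1, 8) for d in (1, 2, 5)]

gap_is_square = all(am_gm_gap(a, b) == (a - b) ** 2 for a in SAMPLES for b in SAMPLES)
inequality_ok = all(am_gm_gap(a, b) >= 0 for a in SAMPLES for b in SAMPLES)
equality_iff_equal = all(
    (am_gm_gap(a, b) == 0) == (a == b) for a in SAMPLES for b in SAMPLES
)
print(gap_is_square, inequality_ok, equality_iff_equal)
```

Working with the squared form avoids floating-point square roots, so every comparison in the check is exact.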


Remark The general Arithmetic-Geometric Mean Inequality for the positive real
numbers a₁, a₂, ..., aₙ is


(3) (a₁a₂ ··· aₙ)^(1/n) ≤ (a₁ + a₂ + ··· + aₙ)/n


with equality occurring if and only if a₁ = a₂ = ··· = aₙ. It is possible to prove this more
general statement using Mathematical Induction, but the proof is somewhat intricate. A more 
elegant proof that uses properties of the exponential function is indicated in Exercise 8.3.9 in 
Chapter 8. 


(c) Bernoulli’s Inequality. If x > −1, then

(4) (1 + x)^n ≥ 1 + nx for all n ∈ N.


The proof uses Mathematical Induction. The case n = 1 yields equality, so the assertion
is valid in this case. Next, we assume the validity of the inequality (4) for k ∈ N and will
deduce it for k + 1. Indeed, the assumptions that (1 + x)^k ≥ 1 + kx and that 1 + x > 0
imply (why?) that

(1 + x)^(k+1) = (1 + x)^k · (1 + x)
             ≥ (1 + kx) · (1 + x) = 1 + (k + 1)x + kx²
             ≥ 1 + (k + 1)x.


Thus, inequality (4) holds for n = k + 1. Therefore, (4) holds for all n ∈ N.
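
Bernoulli's Inequality is easy to test numerically, and doing so also shows that the hypothesis x > −1 cannot simply be dropped. The following Python sketch uses exact rationals; the function `bernoulli_holds`, the sample values, and the particular counterexample x = −4, n = 3 are our own illustrative choices.

```python
from fractions import Fraction


def bernoulli_holds(x, n):
    """Check (1 + x)^n >= 1 + n*x for a single pair (x, n)."""
    return (1 + x) ** n >= 1 + n * x


# Hypothetical sample: x = -3/4, -1/2, ..., 3 (all greater than -1), n = 1, ..., 12.
XS = [Fraction(k, 4) for k in range(-3, 13)]
holds_on_sample = all(bernoulli_holds(x, n) for x in XS for n in range(1, 13))

# Without the hypothesis x > -1 the inequality can fail:
# (1 + (-4))^3 = -27, while 1 + 3*(-4) = -11.
fails_without_hypothesis = not bernoulli_holds(Fraction(-4), 3)

print(holds_on_sample, fails_without_hypothesis)
```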


Exercises for Section 2.1 


1. If a, b ∈ R, prove the following.
(a) If a + b = 0, then b = −a, (b) −(−a) = a,
(c) (−1)a = −a, (d) (−1)(−1) = 1.

2. Prove that if a, b ∈ R, then
(a) −(a + b) = (−a) + (−b), (b) (−a) · (−b) = a · b,
(c) 1/(−a) = −(1/a), (d) −(a/b) = (−a)/b if b ≠ 0.

3. Solve the following equations, justifying each step by referring to an appropriate property or
theorem.
(a) 2x + 5 = 8, (b) x² = 2x,
(c) x² − 1 = 3, (d) (x − 1)(x + 2) = 0.


4. If a ∈ R satisfies a · a = a, prove that either a = 0 or a = 1.

5. If a ≠ 0 and b ≠ 0, show that 1/(ab) = (1/a)(1/b).

6. Use the argument in the proof of Theorem 2.1.4 to show that there does not exist a rational
number s such that s² = 6.

7. Modify the proof of Theorem 2.1.4 to show that there does not exist a rational number t such that
t² = 3.

8. (a) Show that if x, y are rational numbers, then x + y and xy are rational numbers.
(b) Prove that if x is a rational number and y is an irrational number, then x + y is an irrational
number. If, in addition, x ≠ 0, then show that xy is an irrational number.

9. Let K := {s + t√2 : s, t ∈ Q}. Show that K satisfies the following:
(a) If x₁, x₂ ∈ K, then x₁ + x₂ ∈ K and x₁x₂ ∈ K.
(b) If x ≠ 0 and x ∈ K, then 1/x ∈ K.
(Thus the set K is a subfield of R. With the order inherited from R, the set K is an ordered field
that lies between Q and R.)

10. (a) If a < b and c ≤ d, prove that a + c < b + d.
(b) If 0 < a < b and 0 ≤ c ≤ d, prove that 0 ≤ ac ≤ bd.

11. (a) Show that if a > 0, then 1/a > 0 and 1/(1/a) = a.
(b) Show that if a < b, then a < ½(a + b) < b.

12. Let a, b, c, d be numbers satisfying 0 < a < b and c < d < 0. Give an example where ac < bd,
and one where bd < ac.

13. If a, b ∈ R, show that a² + b² = 0 if and only if a = 0 and b = 0.

14. If 0 ≤ a < b, show that a² ≤ ab < b². Show by example that it does not follow that
a² < ab < b².

15. If 0 < a < b, show that (a) a < √(ab) < b, and (b) 1/b < 1/a.

16. Find all real numbers x that satisfy the following inequalities.
(a) x² > 3x + 4, (b) 1 < x² < 4,
(c) 1/x < x, (d) 1/x < x².

17. Prove the following form of Theorem 2.1.9: If a ∈ R is such that 0 ≤ a ≤ ε for every ε > 0, then
a = 0.

18. Let a, b ∈ R, and suppose that for every ε > 0 we have a ≤ b + ε. Show that a ≤ b.

19. Prove that [½(a + b)]² ≤ ½(a² + b²) for all a, b ∈ R. Show that equality holds if and only if
a = b.

20. (a) If 0 < c < 1, show that 0 < c² < c < 1.
(b) If 1 < c, show that 1 < c < c².

21. (a) Prove there is no n ∈ N such that 0 < n < 1. (Use the Well-Ordering Property of N.)
(b) Prove that no natural number can be both even and odd.

22. (a) If c > 1, show that c^n ≥ c for all n ∈ N, and that c^n > c for n > 1.
(b) If 0 < c < 1, show that c^n ≤ c for all n ∈ N, and that c^n < c for n > 1.

23. If a > 0, b > 0, and n ∈ N, show that a < b if and only if a^n < b^n. [Hint: Use Mathematical
Induction.]

24. (a) If c > 1 and m, n ∈ N, show that c^m > c^n if and only if m > n.
(b) If 0 < c < 1 and m, n ∈ N, show that c^m < c^n if and only if m > n.

25. Assuming the existence of roots, show that if c > 1, then c^(1/m) < c^(1/n) if and only if m > n.

26. Use Mathematical Induction to show that if a ∈ R and m, n ∈ N, then a^(m+n) = a^m a^n and
(a^m)^n = a^(mn).


Section 2.2 Absolute Value and the Real Line


From the Trichotomy Property 2.1.5(iii), we are assured that if a ∈ R and a ≠ 0, then
exactly one of the numbers a and −a is positive. The absolute value of a ≠ 0 is defined to
be the positive one of these two numbers. The absolute value of 0 is defined to be 0.


2.2.1 Definition The absolute value of a real number a, denoted by |a|, is defined by


        a    if a > 0,
|a| :=  0    if a = 0,
        −a   if a < 0.


For example, |5| = 5 and |−8| = 8. We see from the definition that |a| ≥ 0 for all
a ∈ R, and that |a| = 0 if and only if a = 0. Also |−a| = |a| for all a ∈ R. Some
additional properties are as follows.


2.2.2 Theorem (a) |ab| = |a||b| for all a, b ∈ R.
(b) |a|² = a² for all a ∈ R.
(c) If c ≥ 0, then |a| ≤ c if and only if −c ≤ a ≤ c.
(d) −|a| ≤ a ≤ |a| for all a ∈ R.


Proof. (a) If either a or b is 0, then both sides are equal to 0. There are four other cases to
consider. If a > 0, b > 0, then ab > 0, so that |ab| = ab = |a||b|. If a > 0, b < 0, then
ab < 0, so that |ab| = −ab = a(−b) = |a||b|. The remaining cases are treated similarly.

(b) Since a² ≥ 0, we have a² = |a²| = |aa| = |a||a| = |a|².

(c) If |a| ≤ c, then we have both a ≤ c and −a ≤ c (why?), which is equivalent to −c ≤ a ≤ c.
Conversely, if −c ≤ a ≤ c, then we have both a ≤ c and −a ≤ c (why?), so that |a| ≤ c.
(d) Take c = |a| in part (c). Q.E.D.
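
Definition 2.2.1 translates directly into a three-case function, and the four parts of Theorem 2.2.2 can then be spot-checked against it. The Python sketch below is an illustration only; the name `absolute_value` and the sample set are hypothetical, and in (c) and (d) the inequalities are taken in their non-strict form (≤).

```python
from fractions import Fraction


def absolute_value(a):
    """Absolute value following the three cases of Definition 2.2.1."""
    if a > 0:
        return a
    if a == 0:
        return 0
    return -a


# Hypothetical sample of rationals: -3, -8/3, ..., 8/3, 3.
SAMPLES = [Fraction(n, 3) for n in range(-9, 10)]

parts_abd_ok = all(
    absolute_value(a * b) == absolute_value(a) * absolute_value(b)   # 2.2.2(a)
    and absolute_value(a) ** 2 == a ** 2                             # 2.2.2(b)
    and -absolute_value(a) <= a <= absolute_value(a)                 # 2.2.2(d)
    for a in SAMPLES for b in SAMPLES
)
part_c_ok = all(
    (absolute_value(a) <= c) == (-c <= a <= c)                       # 2.2.2(c)
    for a in SAMPLES for c in SAMPLES if c >= 0
)
print(parts_abd_ok, part_c_ok)
```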


The following important inequality will be used frequently.

2.2.3 Triangle Inequality If a, b ∈ R, then |a + b| ≤ |a| + |b|.

Proof. From 2.2.2(d), we have −|a| ≤ a ≤ |a| and −|b| ≤ b ≤ |b|. On adding these
inequalities, we obtain

−(|a| + |b|) ≤ a + b ≤ |a| + |b|.

Hence, by 2.2.2(c) we have |a + b| ≤ |a| + |b|. Q.E.D.

It can be shown that equality occurs in the Triangle Inequality if and only if ab ≥ 0,
which is equivalent to saying that a and b do not have opposite signs. (See Exercise 2.)

There are many useful variations of the Triangle Inequality. Here are two.

2.2.4 Corollary If a, b ∈ R, then
(a) ||a| − |b|| ≤ |a − b|,
(b) |a − b| ≤ |a| + |b|.


Proof. (a) We write a = a − b + b and then apply the Triangle Inequality to get
|a| = |(a − b) + b| ≤ |a − b| + |b|. Now subtract |b| to get |a| − |b| ≤ |a − b|. Similarly,
from |b| = |b − a + a| ≤ |b − a| + |a|, we obtain −|a − b| = −|b − a| ≤ |a| − |b|. If we
combine these two inequalities, using 2.2.2(c), we get the inequality in (a).

(b) Replace b in the Triangle Inequality by −b to get |a − b| ≤ |a| + |−b|. Since |−b| =
|b| we obtain the inequality in (b). Q.E.D.
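
The Triangle Inequality and its two variations lend themselves to a quick exact check. The Python sketch below tests them on a hypothetical grid of rational pairs using the built-in `abs`; it also records that equality in the Triangle Inequality occurs on this sample exactly when ab ≥ 0 (so either factor may be zero). The variable names are our own.

```python
from fractions import Fraction

# Hypothetical sample: all pairs from -4, -3.5, ..., 3.5, 4.
SAMPLES = [Fraction(n, 2) for n in range(-8, 9)]
PAIRS = [(a, b) for a in SAMPLES for b in SAMPLES]

triangle_ok = all(abs(a + b) <= abs(a) + abs(b) for a, b in PAIRS)       # 2.2.3
reverse_ok = all(abs(abs(a) - abs(b)) <= abs(a - b) for a, b in PAIRS)   # 2.2.4(a)
difference_ok = all(abs(a - b) <= abs(a) + abs(b) for a, b in PAIRS)     # 2.2.4(b)

# Equality |a + b| = |a| + |b| holds exactly when ab >= 0.
equality_cases_ok = all(
    (abs(a + b) == abs(a) + abs(b)) == (a * b >= 0) for a, b in PAIRS
)
print(triangle_ok, reverse_ok, difference_ok, equality_cases_ok)
```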


A straightforward application of Mathematical Induction extends the Triangle
Inequality to any finite number of elements of R.


2.2.5 Corollary If a₁, a₂, ..., aₙ are any real numbers, then

|a₁ + a₂ + ··· + aₙ| ≤ |a₁| + |a₂| + ··· + |aₙ|.

The following examples illustrate how the properties of absolute value can be used.


2.2.6 Examples (a) Determine the set A of x ∈ R such that |2x + 3| < 7.

From a modification of 2.2.2(c) for the case of strict inequality, we see that x ∈ A if
and only if −7 < 2x + 3 < 7, which is satisfied if and only if −10 < 2x < 4. Dividing by
2, we conclude that A = {x ∈ R : −5 < x < 2}.

(b) Determine the set B := {x ∈ R : |x − 1| < |x|}.

One method is to consider cases so that the absolute value symbols can be removed.
Here we take the cases

(i) x ≥ 1, (ii) 0 ≤ x < 1, (iii) x < 0.

(Why did we choose these three cases?) In case (i) the inequality becomes x − 1 < x,
which is satisfied without further restriction. Therefore all x such that x ≥ 1 belong to the
set B. In case (ii), the inequality becomes −(x − 1) < x, which requires that x > ½. Thus,
this case contributes all x such that ½ < x < 1 to the set B. In case (iii), the inequality
becomes −(x − 1) < −x, which is equivalent to 1 < 0. Since this statement is false, no
value of x from case (iii) satisfies the inequality. Forming the union of the three cases, we
conclude that B = {x ∈ R : x > ½}.

There is a second method of determining the set B based on the fact that a < b if and
only if a² < b² when both a ≥ 0 and b ≥ 0. (See 2.1.13(a).) Thus, the inequality |x − 1| <
|x| is equivalent to the inequality |x − 1|² < |x|². Since |a|² = a² for any a by 2.2.2(b), we
can expand the square to obtain x² − 2x + 1 < x², which simplifies to x > ½. Thus, we
again find that B = {x ∈ R : x > ½}. This method of squaring can sometimes be used to
advantage, but often a case analysis cannot be avoided when dealing with absolute values.

A graphical view of the inequality is obtained by sketching the graphs of y = |x| and
y = |x − 1|, and interpreting the inequality |x − 1| < |x| to mean that the graph of y =
|x − 1| lies underneath the graph of y = |x|. See Figure 2.2.1.


Figure 2.2.1 |x − 1| < |x|


(c) Solve the inequality |2x − 1| ≤ x + 1.

There are two cases to consider. If x < ½, then |2x − 1| = −2x + 1 and the inequality
becomes −2x + 1 ≤ x + 1, which is x ≥ 0. Thus case one gives us 0 ≤ x < ½. For case two
we assume x ≥ ½, which gives us the inequality 2x − 1 ≤ x + 1, or x ≤ 2. Since x ≥ ½, we
obtain ½ ≤ x ≤ 2. Combining the two cases, we get 0 ≤ x ≤ 2. See Figure 2.2.2.


Figure 2.2.2 |2x − 1| ≤ x + 1


(d) Let the function f be defined by f(x) := (2x² + 3x + 1)/(2x − 1) for 2 ≤ x ≤ 3.
Find a constant M such that |f(x)| ≤ M for all x satisfying 2 ≤ x ≤ 3.


We consider separately the numerator and denominator of


|f(x)| = |2x² + 3x + 1| / |2x − 1|.


From the Triangle Inequality, we obtain

|2x² + 3x + 1| ≤ 2|x|² + 3|x| + 1 ≤ 2 · 3² + 3 · 3 + 1 = 28


since |x| ≤ 3 for the x under consideration. Also, |2x − 1| ≥ 2|x| − 1 ≥ 2 · 2 − 1 = 3
since |x| ≥ 2 for the x under consideration. Thus, 1/|2x − 1| ≤ 1/3 for x ≥ 2. (Why?)
Therefore, for 2 ≤ x ≤ 3 we have |f(x)| ≤ 28/3. Hence we can take M = 28/3. (Note
that we have found one such constant M; evidently any number H > 28/3 will also
satisfy |f(x)| ≤ H. It is also possible that 28/3 is not the smallest possible choice
for M.)
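
The closing remark, that 28/3 need not be the smallest admissible M, can be made concrete by evaluating f on a fine grid of the interval from 2 to 3. The Python sketch below does this with exact rationals; the grid spacing 1/100 and the variable names are our own choices, and a grid search only suggests, rather than proves, where the largest value lies.

```python
from fractions import Fraction


def f(x):
    """The function of Example (d): (2x^2 + 3x + 1)/(2x - 1)."""
    return (2 * x * x + 3 * x + 1) / (2 * x - 1)


# Hypothetical grid: 2.00, 2.01, ..., 3.00 as exact rationals.
GRID = [2 + Fraction(k, 100) for k in range(101)]
M = Fraction(28, 3)

bound_ok = all(abs(f(x)) <= M for x in GRID)
largest_seen = max(abs(f(x)) for x in GRID)
print(bound_ok, largest_seen, M)
```

On this grid the largest value is f(3) = 28/5, noticeably below 28/3. The gap arises because the estimate bounds the numerator at x = 3 and the denominator at x = 2, two values of x that cannot occur simultaneously.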


The Real Line 


A convenient and familiar geometric interpretation of the real number system is the real 
line. In this interpretation, the absolute value |a| of an element a in R is regarded as the 
distance from a to the origin 0. More generally, the distance between elements a and b in R
is |a − b|. (See Figure 2.2.3.)


Figure 2.2.3 The distance between a = −2 and b = 3: |(−2) − (3)| = 5


Later we will need precise language to discuss the notion of one real number being
“close to” another. If a is a given real number, then saying that a real number x is “close
to” a should mean that the distance |x − a| between them is “small.” A context in which
this idea can be discussed is provided by the terminology of neighborhoods, which we
now define.


2.2.7 Definition Let a ∈ R and ε > 0. Then the ε-neighborhood of a is the set
V_ε(a) := {x ∈ R : |x − a| < ε}.


For a ∈ R, the statement that x belongs to V_ε(a) is equivalent to either of the
statements (see Figure 2.2.4)


−ε < x − a < ε ⟺ a − ε < x < a + ε.


Figure 2.2.4 An ε-neighborhood of a (the open interval from a − ε to a + ε)


2.2.8 Theorem Let a ∈ R. If x belongs to the neighborhood V_ε(a) for every ε > 0, then
x = a.


Proof. If a particular x satisfies |x − a| < ε for every ε > 0, then it follows from 2.1.9 that
|x − a| = 0, and hence x = a. Q.E.D.


2.2.9 Examples (a) Let U := {x : 0 < x < 1}. If a ∈ U, then let ε be the smaller of the
two numbers a and 1 − a. Then it is an exercise to show that V_ε(a) is contained in U. Thus
each element of U has some ε-neighborhood of it contained in U.

(b) If I := {x : 0 ≤ x ≤ 1}, then for any ε > 0, the ε-neighborhood V_ε(0) of 0 contains
points not in I, and so V_ε(0) is not contained in I. For example, the number x_ε := −ε/2 is in
V_ε(0) but not in I.


(c) If |x − a| < ε and |y − b| < ε, then the Triangle Inequality implies that


|(x + y) − (a + b)| = |(x − a) + (y − b)|
                    ≤ |x − a| + |y − b| < 2ε.


Thus if x, y belong to the ε-neighborhoods of a, b, respectively, then x + y belongs to the
2ε-neighborhood of a + b (but not necessarily to the ε-neighborhood of a + b).
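
Membership in a neighborhood is a single inequality, so the three examples above can be replayed with concrete numbers. In the Python sketch below the function `in_neighborhood` and all of the specific values (a = 1/10 in (a), ε = 1/1000 in (b), x = y = 3/4 with a = b = 0 and ε = 1 in (c)) are hypothetical choices made only for illustration.

```python
from fractions import Fraction


def in_neighborhood(x, a, eps):
    """True if x belongs to the eps-neighborhood of a, that is, |x - a| < eps."""
    if eps <= 0:
        raise ValueError("eps must be positive")
    return abs(x - a) < eps


# Example (a): with eps = min(a, 1 - a), sampled points of the neighborhood stay in U = (0, 1).
a = Fraction(1, 10)
eps = min(a, 1 - a)
probes = [a + eps * Fraction(k, 10) for k in range(-9, 10)]
example_a_ok = all(in_neighborhood(x, a, eps) and 0 < x < 1 for x in probes)

# Example (b): -eps/2 lies in the eps-neighborhood of 0 but not in I = [0, 1].
eps_b = Fraction(1, 1000)
x_b = -eps_b / 2
example_b_ok = in_neighborhood(x_b, 0, eps_b) and not (0 <= x_b <= 1)

# Example (c): x + y lies in the 2*eps-neighborhood of a + b, but possibly not in the eps one.
e = Fraction(1)
x, y = Fraction(3, 4), Fraction(3, 4)
example_c = (
    in_neighborhood(x, 0, e),
    in_neighborhood(y, 0, e),
    in_neighborhood(x + y, 0, 2 * e),
    in_neighborhood(x + y, 0, e),
)
print(example_a_ok, example_b_ok, example_c)
```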


Exercises for Section 2.2 


1. If a, b ∈ R and b ≠ 0, show that:
(a) |a| = √(a²), (b) |a/b| = |a|/|b|.

2. If a, b ∈ R, show that |a + b| = |a| + |b| if and only if ab ≥ 0.

3. If x, y, z ∈ R and x ≤ z, show that x ≤ y ≤ z if and only if |x − y| + |y − z| = |x − z|. Interpret
this geometrically.

4. Show that |x − a| < ε if and only if a − ε < x < a + ε.

5. If a < x < b and a < y < b, show that |x − y| < b − a. Interpret this geometrically.

6. Find all x ∈ R that satisfy the following inequalities:
(a) |4x − 5| ≤ 13, (b) |x² − 1| ≤ 3.

7. Find all x ∈ R that satisfy the equation |x + 1| + |x − 2| = 7.

8. Find all values of x that satisfy the following equations:
(a) x + 1 = |2x − 1|, (b) 2x − 1 = |x − 5|.

9. Find all values of x that satisfy the following inequalities. Sketch graphs.
(a) |x − 2| ≤ x + 1, (b) 3|x| ≤ 2 − x.

10. Find all x ∈ R that satisfy the following inequalities.
(a) |x − 1| > |x + 1|, (b) |x| + |x + 1| < 2.

11. Sketch the graph of the equation y = |x| − |x − 1|.

12. Find all x ∈ R that satisfy the inequality 4 < |x + 2| + |x − 1| < 5.

13. Find all x ∈ R that satisfy both |2x − 3| < 5 and |x + 1| > 2 simultaneously.

14. Determine and sketch the set of pairs (x, y) in R × R that satisfy:
(a) |x| = |y|, (b) |x| + |y| = 1,
(c) |xy| = 2, (d) |x| − |y| = 2.

15. Determine and sketch the set of pairs (x, y) in R × R that satisfy:
(a) |x| ≤ |y|, (b) |x| + |y| ≤ 1,
(c) |xy| ≤ 2, (d) |x| − |y| ≥ 2.

16. Let ε > 0 and δ > 0, and a ∈ R. Show that V_ε(a) ∩ V_δ(a) and V_ε(a) ∪ V_δ(a) are
γ-neighborhoods of a for appropriate values of γ.

17. Show that if a, b ∈ R, and a ≠ b, then there exist ε-neighborhoods U of a and V of b such that
U ∩ V = ∅.

18. Show that if a, b ∈ R then
(a) max{a, b} = ½(a + b + |a − b|) and min{a, b} = ½(a + b − |a − b|).
(b) min{a, b, c} = min{min{a, b}, c}.

19. Show that if a, b, c ∈ R, then the “middle number” is mid{a, b, c} = min{max{a, b},
max{b, c}, max{c, a}}.


Section 2.3 The Completeness Property of R 


Thus far, we have discussed the algebraic properties and the order properties of the real number 
system R. In this section we shall present one more property of R that is often called the 
“Completeness Property.” The system Q of rational numbers also has the algebraic and order 
properties described in the preceding sections, but we have seen that √2 cannot be represented
as a rational number; therefore √2 does not belong to Q. This observation shows the necessity
of an additional property to characterize the real number system. This additional property, the 
Completeness (or the Supremum) Property, is an essential property of R, and we will say that R 
is a complete ordered field. It is this special property that permits us to define and develop the 
various limiting procedures that will be discussed in the chapters that follow. 

There are several different ways to describe the Completeness Property. We choose to 
give what is probably the most efficient approach by assuming that each nonempty 
bounded subset of R has a supremum.


Suprema and Infima


We now introduce the notions of upper bound and lower bound for a set of real numbers. 
These ideas will be of utmost importance in later sections. 


2.3.1 Definition Let S be a nonempty subset of R. 


(a) The set S is said to be bounded above if there exists a number u ∈ R such that s ≤ u
for all s ∈ S. Each such number u is called an upper bound of S.

(b) The set S is said to be bounded below if there exists a number w ∈ R such that w ≤ s
for all s ∈ S. Each such number w is called a lower bound of S.

(c) A set is said to be bounded if it is both bounded above and bounded below. A set is
said to be unbounded if it is not bounded.


For example, the set S := {x ∈ R : x < 2} is bounded above; the number 2 and any
number larger than 2 is an upper bound of S. This set has no lower bounds, so that the set is
not bounded below. Thus it is unbounded (even though it is bounded above).

If a set has one upper bound, then it has infinitely many upper bounds, because if u is an
upper bound of S, then the numbers u + 1, u + 2, ... are also upper bounds of S. (A similar
observation is valid for lower bounds.)

In the set of upper bounds of S and the set of lower bounds of S, we single out their least 
and greatest elements, respectively, for special attention in the following definition. (See 
Figure 2.3.1.) 


lower bounds of S upper bounds of S 
Figure 2.3.1 inf S and sup S 


2.3.2 Definition Let S be a nonempty subset of R. 


(a) If S is bounded above, then a number u is said to be a supremum (or a least upper
bound) of S if it satisfies the conditions:


(1) u is an upper bound of S, and
(2) if v is any upper bound of S, then u ≤ v.

(b) If S is bounded below, then a number w is said to be an infimum (or a greatest lower
bound) of S if it satisfies the conditions:


(1') w is a lower bound of S, and
(2') if t is any lower bound of S, then t ≤ w.


It is not difficult to see that there can be only one supremum of a given subset S of R. 
(Then we can refer to the supremum of a set instead of a supremum.) For, suppose that u1
and u2 are both suprema of S. If u1 < u2, then the hypothesis that u2 is a supremum implies
that u1 cannot be an upper bound of S. Similarly, we see that u2 < u1 is not possible.
Therefore, we must have u1 = u2. A similar argument can be given to show that the
infimum of a set is uniquely determined. 

If the supremum or the infimum of a set S exists, we will denote them by 


sup S and inf S. 
We also observe that if u' is an arbitrary upper bound of a nonempty set S, then sup S ≤ u'.
This is because sup S is the least of the upper bounds of S. 

First of all, it needs to be emphasized that in order for a nonempty set S in R to have a 
supremum, it must have an upper bound. Thus, not every subset of R has a supremum; 
similarly, not every subset of R has an infimum. Indeed, there are four possibilities for a 
nonempty subset S of R: it can 


(i) have both a supremum and an infimum,
(ii) have a supremum but no infimum,
(iii) have an infimum but no supremum,
(iv) have neither a supremum nor an infimum.


We also wish to stress that in order to show that u = sup S for some nonempty subset S 
of R, we need to show that both (1) and (2) of Definition 2.3.2(a) hold. It will be instructive 
to reformulate these statements. 

The definition of u = sup S asserts that u is an upper bound of S such that u ≤ v for any
upper bound v of S. It is useful to have alternative ways of expressing the idea that u is the
“least” of the upper bounds of S. One way is to observe that any number smaller than u is
not an upper bound of S. That is, if z < u, then z is not an upper bound of S. But to say that z
is not an upper bound of S means there exists an element s_z in S such that z < s_z. Similarly,
if ε > 0, then u − ε is smaller than u and thus fails to be an upper bound of S.

The following statements about an upper bound u of a set S are equivalent: 


(1) if v is any upper bound of S, then u ≤ v,
(2) if z < u, then z is not an upper bound of S,
(3) if z < u, then there exists s_z ∈ S such that z < s_z,
(4) if ε > 0, then there exists s_ε ∈ S such that u − ε < s_ε.


Therefore, we can state two alternate formulations for the supremum. 


2.3.3 Lemma A number u is the supremum of a nonempty subset S of R if and only if u 
satisfies the conditions: 


(1) s ≤ u for all s ∈ S,
(2) if v < u, then there exists s' ∈ S such that v < s'.


For future work with limits, it is useful to have this condition expressed in terms of
ε > 0. This is done in the next lemma.


2.3.4 Lemma An upper bound u of a nonempty set S in R is the supremum of S if and
only if for every ε > 0 there exists an s_ε ∈ S such that u − ε < s_ε.


Proof. If u is an upper bound of S that satisfies the stated condition and if v < u, then we
put ε := u − v. Then ε > 0, so there exists s_ε ∈ S such that v = u − ε < s_ε. Therefore, v is
not an upper bound of S, and we conclude that u = sup S.

Conversely, suppose that u = sup S and let ε > 0. Since u − ε < u, then u − ε is not an
upper bound of S. Therefore, some element s_ε of S must be greater than u − ε; that is,
u − ε < s_ε. (See Figure 2.3.2.) Q.E.D.


It is important to realize that the supremum of a set may or may not be an element of 
the set. Sometimes it is and sometimes it is not, depending on the particular set. We 
consider a few examples. 
Figure 2.3.2 u = sup S


2.3.5 Examples (a) If a nonempty set S1 has a finite number of elements, then it can be
shown that S1 has a largest element u and a least element w. Then u = sup S1 and w = inf S1,
and they are both members of S1. (This is clear if S1 has only one element, and it can be
proved by induction on the number of elements in S1; see Exercises 12 and 13.)

(b) The set S2 := {x : 0 ≤ x ≤ 1} clearly has 1 for an upper bound. We prove that 1 is its
supremum as follows. If v < 1, there exists an element s' ∈ S2 such that v < s'. (Name one
such element s'.) Therefore v is not an upper bound of S2 and, since v is an arbitrary number
v < 1, we conclude that sup S2 = 1. It is similarly shown that inf S2 = 0. Note that both the
supremum and the infimum of S2 are contained in S2.

(c) The set S3 := {x : 0 < x < 1} clearly has 1 for an upper bound. Using the same
argument as given in (b), we see that sup S3 = 1. In this case, the set S3 does not contain its
supremum. Similarly, inf S3 = 0 is not contained in S3.
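
The ε-criterion of Lemma 2.3.4 can be tried out on the set S3 = (0, 1) of Example 2.3.5(c). The following Python sketch is only an illustration: the function name `witness` and the particular choice s = 1 − ε/2 are assumptions made here and do not come from the text. Exact rational arithmetic is used so that the inequalities are checked without rounding.

```python
from fractions import Fraction

def witness(eps):
    """Return an element s of S3 = (0, 1) with 1 - eps < s (Lemma 2.3.4 with u = 1)."""
    eps = Fraction(eps)
    if eps <= 0:
        raise ValueError("eps must be positive")
    # For eps >= 1 any element of (0, 1) works; otherwise step halfway back from 1.
    return Fraction(1, 2) if eps >= 1 else 1 - eps / 2

for eps in (Fraction(1, 10), Fraction(1, 1000), 5):
    s = witness(eps)
    assert 0 < s < 1      # s belongs to S3
    assert 1 - eps < s    # so 1 - eps is not an upper bound of S3
```

Since 1 is an upper bound of S3 and every ε > 0 admits such a witness, the lemma gives sup S3 = 1, in agreement with the example.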


The Completeness Property of R 


It is not possible to prove on the basis of the field and order properties of R that were 
discussed in Section 2.1 that every nonempty subset of R that is bounded above has a 
supremum in R. However, it is a deep and fundamental property of the real number system 
that this is indeed the case. We will make frequent and essential use of this property, 
especially in our discussion of limiting processes. The following statement concerning the 
existence of suprema is our final assumption about R. Thus, we say that R is a complete 
ordered field. 


2.3.6 The Completeness Property of R Every nonempty set of real numbers that has 
an upper bound also has a supremum in R. 


This property is also called the Supremum Property of R. The analogous property for 
infima can be deduced from the Completeness Property as follows. Suppose that S is a 
nonempty subset of R that is bounded below. Then the nonempty set S̄ := {−s : s ∈ S} is
bounded above, and the Supremum Property implies that u := sup S̄ exists in R. The reader
should verify in detail that —u is the infimum of S. 
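
For a finite set the supremum and infimum are simply the largest and least elements (Example 2.3.5(a)), so the reflection argument above can be checked directly. This is a minimal sketch under that finiteness assumption; the helper name `inf_via_sup` is hypothetical.

```python
from fractions import Fraction

def inf_via_sup(S):
    """Compute inf S for a finite nonempty set S as -sup{-s : s in S}."""
    reflected = {-s for s in S}   # the set {-s : s in S}
    return -max(reflected)        # for a finite set, sup is the maximum

S = {Fraction(3, 4), Fraction(-5, 2), Fraction(7)}
assert inf_via_sup(S) == min(S)
```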


Exercises for Section 2.3 


1. Let S1 := {x ∈ R : x ≥ 0}. Show in detail that the set S1 has lower bounds, but no upper
bounds. Show that inf S1 = 0.

2. Let S2 := {x ∈ R : x > 0}. Does S2 have lower bounds? Does S2 have upper bounds? Does
inf S2 exist? Does sup S2 exist? Prove your statements.

3. Let S3 = {1/n : n ∈ N}. Show that sup S3 = 1 and inf S3 ≥ 0. (It will follow from the
Archimedean Property in Section 2.4 that inf S3 = 0.)

4. Let S4 := {1 − (−1)^n/n : n ∈ N}. Find inf S4 and sup S4.

5. Find the infimum and supremum, if they exist, of each of the following sets.
(a) A := {x ∈ R : 2x + 5 > 0},   (b) B := {x ∈ R : x + 2 ≥ x²},
(c) C := {x ∈ R : x < 1/x},   (d) D := {x ∈ R : x² − 2x − 5 < 0}.

6. Let S be a nonempty subset of R that is bounded below. Prove that inf S = −sup{−s : s ∈ S}.

7. If a set S ⊆ R contains one of its upper bounds, show that this upper bound is the supremum of S.

8. Let S ⊆ R be nonempty. Show that u ∈ R is an upper bound of S if and only if the conditions
t ∈ R and t > u imply that t ∉ S.

9. Let S ⊆ R be nonempty. Show that if u = sup S, then for every number n ∈ N the number
u − 1/n is not an upper bound of S, but the number u + 1/n is an upper bound of S. (The
converse is also true; see Exercise 2.4.3.)

10. Show that if A and B are bounded subsets of R, then A ∪ B is a bounded set. Show that
sup(A ∪ B) = sup{sup A, sup B}.

11. Let S be a bounded set in R and let S0 be a nonempty subset of S. Show that
inf S ≤ inf S0 ≤ sup S0 ≤ sup S.

12. Let S ⊆ R and suppose that s* := sup S belongs to S. If u ∉ S, show that
sup(S ∪ {u}) = sup{s*, u}.

13. Show that a nonempty finite set S ⊆ R contains its supremum. [Hint: Use Mathematical
Induction and the preceding exercise.]

14. Let S be a set that is bounded below. Prove that a lower bound w of S is the infimum of S if and
only if for any ε > 0 there exists t ∈ S such that t < w + ε.


Section 2.4 Applications of the Supremum Property 


We will now discuss how to work with suprema and infima. We will also give some very 
important applications of these concepts to derive fundamental properties of R. We begin 
with examples that illustrate useful techniques in applying the ideas of supremum and 
infimum. 


2.4.1 Examples (a) It is an important fact that taking suprema and infima of sets is 
compatible with the algebraic properties of R. As an example, we present here the 
compatibility of taking suprema and addition. 

Let S be a nonempty subset of R that is bounded above, and let a be any number in R.
Define the set a + S := {a + s : s ∈ S}. We will prove that


sup(a + S) = a + sup S.


If we let u := sup S, then x ≤ u for all x ∈ S, so that a + x ≤ a + u. Therefore, a + u
is an upper bound for the set a + S; consequently, we have sup(a + S) ≤ a + u.

Now if v is any upper bound of the set a + S, then a + x ≤ v for all x ∈ S.
Consequently x ≤ v − a for all x ∈ S, so that v − a is an upper bound of S. Therefore,
u = sup S ≤ v − a, which gives us a + u ≤ v. Since v is any upper bound of a + S, we can
replace v by sup(a + S) to get a + u ≤ sup(a + S).

Combining these inequalities, we conclude that 


sup(a + S) = a + u = a + sup S. 


For similar relationships between the suprema and infima of sets and the operations of 
addition and multiplication, see the exercises. 
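
The identity sup(a + S) = a + sup S can be sanity-checked on a finite set, where the supremum is the maximum (Example 2.3.5(a)). The sketch below is an illustration under that finiteness assumption; the helper `translate` and the sample set are choices made here, not part of the text.

```python
from fractions import Fraction

def translate(a, S):
    """Return the set a + S = {a + s : s in S}."""
    return {a + s for s in S}

S = {Fraction(1, n) for n in range(1, 50)}   # finite sample set, so sup S = max S
a = Fraction(-3, 2)

assert max(translate(a, S)) == a + max(S)    # sup(a + S) = a + sup S
assert min(translate(a, S)) == a + min(S)    # the analogous statement for infima
```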
(b) If the suprema or infima of two sets are involved, it is often necessary to establish
results in two stages, working with one set at a time. Here is an example.
Suppose that A and B are nonempty subsets of R that satisfy the property:

a ≤ b for all a ∈ A and all b ∈ B.

We will prove that

sup A ≤ inf B.

For, given b ∈ B, we have a ≤ b for all a ∈ A. This means that b is an upper bound of A, so
that sup A ≤ b. Next, since the last inequality holds for all b ∈ B, we see that the number
sup A is a lower bound for the set B. Therefore, we conclude that sup A ≤ inf B.


Functions 


The idea of upper bound and lower bound is applied to functions by considering the range
of a function. Given a function f : D → R, we say that f is bounded above if the set
f(D) = {f(x) : x ∈ D} is bounded above in R; that is, there exists B ∈ R such that f(x) ≤
B for all x ∈ D. Similarly, the function f is bounded below if the set f(D) is bounded below.
We say that f is bounded if it is bounded above and below; this is equivalent to saying that
there exists B ∈ R such that |f(x)| ≤ B for all x ∈ D.

The following example illustrates how to work with suprema and infima of functions. 


2.4.2 Examples Suppose that f and g are real-valued functions with common domain
D ⊆ R. We assume that f and g are bounded.


(a) If f(x) ≤ g(x) for all x ∈ D, then sup f(D) ≤ sup g(D), which is sometimes written:

sup_{x∈D} f(x) ≤ sup_{x∈D} g(x).

We first note that f(x) ≤ g(x) ≤ sup g(D), which implies that the number sup g(D) is
an upper bound for f(D). Therefore, sup f(D) ≤ sup g(D).

(b) We note that the hypothesis f(x) ≤ g(x) for all x ∈ D in part (a) does not imply any
relation between sup f(D) and inf g(D).

For example, if f(x) := x² and g(x) := x with D = {x : 0 ≤ x ≤ 1}, then f(x) ≤ g(x)
for all x ∈ D. However, we see that sup f(D) = 1 and inf g(D) = 0. Since sup g(D) = 1, the
conclusion of (a) holds.

(c) If f(x) ≤ g(y) for all x, y ∈ D, then we may conclude that sup f(D) ≤ inf g(D), which
we may write as:

sup_{x∈D} f(x) ≤ inf_{y∈D} g(y).

(Note that the functions in (b) do not satisfy this hypothesis.)

The proof proceeds in two stages as in Example 2.4.1(b). The reader should write out 
the details of the argument. 


Further relationships between suprema and infima of functions are given in the exercises. 
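
The functions of Example 2.4.2(b) can be examined on a finite sample of D = [0, 1]. The sketch below is an illustration only: the grid of 101 rational points is an assumption made here, and on a finite sample the supremum and infimum reduce to the maximum and minimum.

```python
from fractions import Fraction

D = [Fraction(k, 100) for k in range(101)]   # finite sample of [0, 1], endpoints included
f = lambda x: x * x
g = lambda x: x

assert all(f(x) <= g(x) for x in D)          # hypothesis of 2.4.2(a)

sup_f, sup_g = max(map(f, D)), max(map(g, D))
inf_g = min(map(g, D))

assert sup_f <= sup_g                        # conclusion of 2.4.2(a)
assert not (sup_f <= inf_g)                  # the stronger conclusion of 2.4.2(c) fails here
```

The last assertion reflects the point of part (b): the pointwise hypothesis f(x) ≤ g(x) compares values at the same x only, so it gives no control over sup f(D) relative to inf g(D).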


The Archimedean Property 


Because of your familiarity with the set R and the customary picture of the real line, it may 
seem obvious that the set N of natural numbers is not bounded in R. How can we prove this 
“obvious” fact? In fact, we cannot do so by using only the Algebraic and Order Properties
given in Section 2.1. Indeed, we must use the Completeness Property of R as well as the
Inductive Property of N (that is, if n ∈ N, then n + 1 ∈ N).

The absence of upper bounds for N means that given any real number x there exists a 
natural number n (depending on x) such that x < n. 


2.4.3 Archimedean Property If x ∈ R, then there exists n_x ∈ N such that x < n_x.


Proof. If the assertion is false, then n ≤ x for all n ∈ N; therefore, x is an upper bound
of N. Therefore, by the Completeness Property, the nonempty set N has a supremum u ∈ R.
Subtracting 1 from u gives a number u − 1, which is smaller than the supremum u of N.
Therefore u − 1 is not an upper bound of N, so there exists m ∈ N with u − 1 < m. Adding
1 gives u < m + 1, and since m + 1 ∈ N, this inequality contradicts the fact that u is an
upper bound of N. Q.E.D.


2.4.4 Corollary If S := {1/n : n ∈ N}, then inf S = 0.


Proof. Since S ≠ ∅ is bounded below by 0, it has an infimum and we let w := inf S. It is
clear that w ≥ 0. For any ε > 0, the Archimedean Property implies that there exists n ∈ N
such that 1/ε < n, which implies 1/n < ε. Therefore we have


0 ≤ w ≤ 1/n < ε.


But since ε > 0 is arbitrary, it follows from Theorem 2.1.9 that w = 0. Q.E.D.


2.4.5 Corollary If t > 0, there exists n_t ∈ N such that 0 < 1/n_t < t.

Proof. Since inf{1/n : n ∈ N} = 0 and t > 0, then t is not a lower bound for the set
{1/n : n ∈ N}. Thus there exists n_t ∈ N such that 0 < 1/n_t < t. Q.E.D.

2.4.6 Corollary If y > 0, there exists n_y ∈ N such that n_y − 1 ≤ y < n_y.

Proof. The Archimedean Property ensures that the subset E_y := {m ∈ N : y < m} of
N is not empty. By the Well-Ordering Property 1.2.1, E_y has a least element, which
we denote by n_y. Then n_y − 1 does not belong to E_y, and hence we have


n_y − 1 ≤ y < n_y. Q.E.D.


Collectively, the Corollaries 2.4.4–2.4.6 are sometimes referred to as the Archimedean
Property of R. 
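
For rational inputs, the natural numbers promised by Corollaries 2.4.5 and 2.4.6 can be computed explicitly. The sketch below assumes exact rational arithmetic via Python's `fractions` module; the function names `n_y` and `n_t` mirror the notation of the corollaries but are otherwise choices made here, and the floor function stands in for the Well-Ordering argument of the proof.

```python
from fractions import Fraction
import math

def n_y(y):
    """Least natural number n with y < n, so that n - 1 <= y < n (Corollary 2.4.6)."""
    y = Fraction(y)
    if y <= 0:
        raise ValueError("y must be positive")
    return math.floor(y) + 1

def n_t(t):
    """A natural number n with 0 < 1/n < t (Corollary 2.4.5)."""
    t = Fraction(t)
    if t <= 0:
        raise ValueError("t must be positive")
    return math.floor(1 / t) + 1   # n > 1/t, hence 1/n < t

y = Fraction(7, 2)
assert n_y(y) - 1 <= y < n_y(y)

t = Fraction(1, 10)
assert 0 < Fraction(1, n_t(t)) < t
```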


The Existence of √2


The importance of the Supremum Property lies in the fact that it guarantees the existence of 
real numbers under certain hypotheses. We shall make use of it in this way many times. At 
the moment, we shall illustrate this use by proving the existence of a positive real number x 
such that x² = 2; that is, the positive square root of 2. It was shown earlier (see Theorem
2.1.4) that such an x cannot be a rational number; thus, we will be deriving the existence of 
at least one irrational number. 
2.4.7 Theorem There exists a positive real number x such that x² = 2.


Proof. Let S := {s ∈ R : 0 ≤ s, s² < 2}. Since 1 ∈ S, the set is not empty. Also, S is
bounded above by 2, because if t > 2, then t² > 4 so that t ∉ S. Therefore the Supremum
Property implies that the set S has a supremum in R, and we let x := sup S. Note that x > 1.
We will prove that x² = 2 by ruling out the other two possibilities: x² < 2 and x² > 2.
First assume that x² < 2. We will show that this assumption contradicts the fact that
x = sup S by finding an n ∈ N such that x + 1/n ∈ S, thus implying that x is not an upper
bound for S. To see how to choose n, note that 1/n² ≤ 1/n so that


(x + 1/n)² = x² + 2x/n + 1/n² ≤ x² + (1/n)(2x + 1).

Hence if we can choose n so that

(1/n)(2x + 1) < 2 − x²,

then we get (x + 1/n)² < x² + (2 − x²) = 2. By assumption we have 2 − x² > 0, so that
(2 − x²)/(2x + 1) > 0. Hence the Archimedean Property (Corollary 2.4.5) can be used to
obtain n ∈ N such that

1/n < (2 − x²)/(2x + 1).


These steps can be reversed to show that for this choice of n we have x + 1/n ∈ S, which
contradicts the fact that x is an upper bound of S. Therefore we cannot have x² < 2.

Now assume that x² > 2. We will show that it is then possible to find m ∈ N such that
x − 1/m is also an upper bound of S, contradicting the fact that x = sup S. To do this, note
that

(x − 1/m)² = x² − 2x/m + 1/m² > x² − 2x/m.

Hence if we can choose m so that

2x/m < x² − 2,

then (x − 1/m)² > x² − (x² − 2) = 2. Now by assumption we have x² − 2 > 0, so that
(x² − 2)/2x > 0. Hence, by the Archimedean Property, there exists m ∈ N such that

1/m < (x² − 2)/(2x).


These steps can be reversed to show that for this choice of m we have (x − 1/m)² > 2.
Now if s ∈ S, then s² < 2 < (x − 1/m)², whence it follows from 2.1.13(a) that
s < x − 1/m. This implies that x − 1/m is an upper bound for S, which contradicts the
fact that x = sup S. Therefore we cannot have x² > 2.

Since the possibilities x² < 2 and x² > 2 have been excluded, we must have
x² = 2. Q.E.D.
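
The first half of the proof is constructive once x is given: from any x > 0 with x² < 2 it produces an n with (x + 1/n)² < 2. The sketch below runs that step for a rational x, which is an assumption made for the sake of exact arithmetic (the supremum in the proof is not rational); the function name `step_up` is hypothetical.

```python
from fractions import Fraction
import math

def step_up(x):
    """Given rational x > 0 with x*x < 2, return n in N with (x + 1/n)**2 < 2."""
    x = Fraction(x)
    if not (x > 0 and x * x < 2):
        raise ValueError("need x > 0 and x^2 < 2")
    bound = (2 - x * x) / (2 * x + 1)   # positive by assumption
    return math.floor(1 / bound) + 1    # Archimedean choice: 1/n < bound

x = Fraction(7, 5)                      # 49/25 < 2
n = step_up(x)
assert (x + Fraction(1, n)) ** 2 < 2    # x + 1/n still belongs to S
```

So no rational x with x² < 2 can be an upper bound of S, which is the same mechanism the proof applies to x = sup S.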


By slightly modifying the preceding argument, the reader can show that if a > 0, then
there is a unique b > 0 such that b² = a. We call b the positive square root of a and denote
it by b = √a or b = a^(1/2). A slightly more complicated argument involving the binomial
theorem can be formulated to establish the existence of a unique positive nth root of a,
denoted by ⁿ√a or a^(1/n), for each n ∈ N.


Remark If in the proof of Theorem 2.4.7 we replace the set S by the set of rational
numbers T := {r ∈ Q : 0 ≤ r, r² < 2}, the argument then gives the conclusion that y :=
sup T satisfies y² = 2. Since we have seen in Theorem 2.1.4 that y cannot be a rational
number, it follows that the set T that consists of rational numbers does not have a supremum 
belonging to the set Q. Thus the ordered field Q of rational numbers does not possess the 
Completeness Property. 


Density of Rational Numbers in R 


We now know that there exists at least one irrational real number, namely √2. Actually there
are “more” irrational numbers than rational numbers in the sense that the set of rational 
numbers is countable (as shown in Section 1.3), while the set of irrational numbers is 
uncountable (see Section 2.5). However, we next show that in spite of this apparent disparity, 
the set of rational numbers is “dense” in R in the sense that given any two real numbers there 
is a rational number between them (in fact, there are infinitely many such rational numbers). 


2.4.8 The Density Theorem If x and y are any real numbers with x < y, then there
exists a rational number r ∈ Q such that x < r < y.


Proof. It is no loss of generality (why?) to assume that x > 0. Since y − x > 0, it follows
from Corollary 2.4.5 that there exists n ∈ N such that 1/n < y − x. Therefore, we have
nx + 1 < ny. If we apply Corollary 2.4.6 to nx > 0, we obtain m ∈ N with
m − 1 ≤ nx < m. Therefore, m ≤ nx + 1 < ny, whence nx < m < ny. Thus, the rational
number r := m/n satisfies x < r < y. Q.E.D.
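
The proof of the Density Theorem is an explicit recipe: pick n with 1/n < y − x, then pick m with m − 1 ≤ nx < m. The following sketch carries out those two steps. It assumes 0 < x < y and that x and y are given as exactly representable numbers (rationals or floats converted to `Fraction`); the name `rational_between` is a choice made here.

```python
from fractions import Fraction
import math

def rational_between(x, y):
    """Return a rational r = m/n with x < r < y, following the proof of Theorem 2.4.8."""
    x, y = Fraction(x), Fraction(y)
    if not (0 < x < y):
        raise ValueError("need 0 < x < y")
    n = math.floor(1 / (y - x)) + 1   # 1/n < y - x, so n*x + 1 < n*y
    m = math.floor(n * x) + 1         # m - 1 <= n*x < m
    return Fraction(m, n)

x, y = Fraction(1, 3), Fraction(1, 2)
r = rational_between(x, y)
assert x < r < y
```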


To round out the discussion of the interlacing of rational and irrational numbers, we
have the same “betweenness property” for the set of irrational numbers.


2.4.9 Corollary If x and y are real numbers with x < y, then there exists an irrational
number z such that x < z < y.


Proof. If we apply the Density Theorem 2.4.8 to the real numbers x/√2 and y/√2, we
obtain a rational number r ≠ 0 (why?) such that

x/√2 < r < y/√2.

Then z := r√2 is irrational (why?) and satisfies x < z < y. Q.E.D.


Exercises for Section 2.4 


1. Show that sup{1 − 1/n : n ∈ N} = 1.

2. If S := {1/n − 1/m : n, m ∈ N}, find inf S and sup S.

3. Let S ⊆ R be nonempty. Prove that if a number u in R has the properties: (i) for every n ∈ N the
number u − 1/n is not an upper bound of S, and (ii) for every number n ∈ N the number u + 1/n
is an upper bound of S, then u = sup S. (This is the converse of Exercise 2.3.9.)


4. Let S be a nonempty bounded set in R.
(a) Let a > 0, and let aS := {as : s ∈ S}. Prove that

inf(aS) = a inf S, sup(aS) = a sup S.

(b) Let b < 0 and let bS = {bs : s ∈ S}. Prove that

inf(bS) = b sup S, sup(bS) = b inf S.

5. Let S be a set of nonnegative real numbers that is bounded above and let T := {x² : x ∈ S}.
Prove that if u = sup S, then u² = sup T. Give an example that shows the conclusion may be
false if the restriction against negative numbers is removed.

6. Let X be a nonempty set and let f : X → R have bounded range in R. If a ∈ R, show that
Example 2.4.1(a) implies that

sup{a + f(x) : x ∈ X} = a + sup{f(x) : x ∈ X}.

Show that we also have

inf{a + f(x) : x ∈ X} = a + inf{f(x) : x ∈ X}.

7. Let A and B be bounded nonempty subsets of R, and let A + B := {a + b : a ∈ A, b ∈ B}. Prove
that sup(A + B) = sup A + sup B and inf(A + B) = inf A + inf B.

8. Let X be a nonempty set, and let f and g be defined on X and have bounded ranges in R. Show that

sup{f(x) + g(x) : x ∈ X} ≤ sup{f(x) : x ∈ X} + sup{g(x) : x ∈ X}

and that

inf{f(x) : x ∈ X} + inf{g(x) : x ∈ X} ≤ inf{f(x) + g(x) : x ∈ X}.

Give examples to show that each of these inequalities can be either equalities or strict
inequalities.

9. Let X = Y := {x ∈ R : 0 < x < 1}. Define h : X × Y → R by h(x, y) := 2x + y.
(a) For each x ∈ X, find f(x) := sup{h(x, y) : y ∈ Y}; then find inf{f(x) : x ∈ X}.
(b) For each y ∈ Y, find g(y) := inf{h(x, y) : x ∈ X}; then find sup{g(y) : y ∈ Y}. Compare
with the result found in part (a).

10. Perform the computations in (a) and (b) of the preceding exercise for the function h : X × Y → R
defined by

h(x, y) := 0 if x < y,
h(x, y) := 1 if x ≥ y.

11. Let X and Y be nonempty sets and let h : X × Y → R have bounded range in R. Let f : X → R
and g : Y → R be defined by

f(x) := sup{h(x, y) : y ∈ Y}, g(y) := inf{h(x, y) : x ∈ X}.

Prove that

sup{g(y) : y ∈ Y} ≤ inf{f(x) : x ∈ X}.

We sometimes express this by writing

sup_y inf_x h(x, y) ≤ inf_x sup_y h(x, y).

Note that Exercises 9 and 10 show that the inequality may be either an equality or a strict
inequality.
12. Let X and Y be nonempty sets and let h : X × Y → R have bounded range in R. Let F : X → R
and G : Y → R be defined by

F(x) := sup{h(x, y) : y ∈ Y}, G(y) := sup{h(x, y) : x ∈ X}.

Establish the Principle of the Iterated Suprema:

sup{h(x, y) : x ∈ X, y ∈ Y} = sup{F(x) : x ∈ X} = sup{G(y) : y ∈ Y}.

We sometimes express this in symbols by

sup_{x,y} h(x, y) = sup_x sup_y h(x, y) = sup_y sup_x h(x, y).

13. Given any x ∈ R, show that there exists a unique n ∈ Z such that n − 1 ≤ x < n.

14. If y > 0, show that there exists n ∈ N such that 1/2^n < y.

15. Modify the argument in Theorem 2.4.7 to show that there exists a positive real number y such
that y² = 3.

16. Modify the argument in Theorem 2.4.7 to show that if a > 0, then there exists a positive real
number z such that z² = a.

17. Modify the argument in Theorem 2.4.7 to show that there exists a positive real number u such
that u³ = 2.

18. Complete the proof of the Density Theorem 2.4.8 by removing the assumption that x > 0.

19. If u > 0 is any real number and x < y, show that there exists a rational number r such that
x < ru < y. (Hence the set {ru : r ∈ Q} is dense in R.)


Section 2.5 Intervals 


The Order Relation on R determines a natural collection of subsets called “intervals.”
The notations and terminology for these special sets will be familiar from earlier
courses. If a, b ∈ R satisfy a < b, then the open interval determined by a and b is
the set


(a, b) := {x ∈ R : a < x < b}.


The points a and b are called the endpoints of the interval; however, the endpoints are not 
included in an open interval. If both endpoints are adjoined to this open interval, then we 
obtain the closed interval determined by a and b; namely, the set 


[a, b] := {x ∈ R : a ≤ x ≤ b}.


The two half-open (or half-closed) intervals determined by a and b are [a, b), which 
includes the endpoint a, and (a, b], which includes the endpoint b. 

Each of these four intervals is bounded and has length defined by b — a. If a = b, the 
corresponding open interval is the empty set (a, a) = ∅, whereas the corresponding closed
interval is the singleton set [a, a] = {a}. 

There are five types of unbounded intervals for which the symbols ∞ (or +∞) and −∞
are used as notational convenience in place of the endpoints. The infinite open intervals are
the sets of the form


(a, ∞) := {x ∈ R : x > a} and (−∞, b) := {x ∈ R : x < b}.




The first set has no upper bounds and the second one has no lower bounds. Adjoining 
endpoints gives us the infinite closed intervals: 


[a, ∞) := {x ∈ R : a ≤ x} and (−∞, b] := {x ∈ R : x ≤ b}.


It is often convenient to think of the entire set R as an infinite interval; in this case, we write
(−∞, ∞) := R. No point is an endpoint of (−∞, ∞).


Warning It must be emphasized that ∞ and −∞ are not elements of R, but only
convenient symbols.


Characterization of Intervals 


An obvious property of intervals is that if two points x, y with x < y belong to an interval I,
then any point lying between them also belongs to I. That is, if x < t < y, then the point t
belongs to the same interval as x and y. In other words, if x and y belong to an interval I,
then the interval [x, y] is contained in I. We now show that a subset of R possessing this
property must be an interval.


2.5.1 Characterization Theorem If S is a subset of R that contains at least two points
and has the property


(1) if x, y ∈ S and x < y, then [x, y] ⊆ S,


then S is an interval. 


Proof. There are four cases to consider: (i) S is bounded, (ii) S is bounded above but not 
below, (iii) S is bounded below but not above, and (iv) S is neither bounded above nor 
below. 

Case (i): Let a := inf S and b := sup S. Then S ⊆ [a, b] and we will show that
(a, b) ⊆ S.

If a < z < b, then z is not a lower bound of S, so there exists x ∈ S with x < z. Also, z
is not an upper bound of S, so there exists y ∈ S with z < y. Therefore z ∈ [x, y], so property
(1) implies that z ∈ S. Since z is an arbitrary element of (a, b), we conclude that (a, b) ⊆ S.

Now if a ∈ S and b ∈ S, then S = [a, b]. (Why?) If a ∉ S and b ∉ S, then S = (a, b).
The other possibilities lead to either S = (a, b] or S = [a, b).

Case (ii): Let b := sup S. Then S ⊆ (−∞, b] and we will show that (−∞, b) ⊆ S. For,
if z < b, then there exist x, y ∈ S such that z ∈ [x, y] ⊆ S. (Why?) Therefore (−∞, b) ⊆ S.
If b ∈ S, then S = (−∞, b], and if b ∉ S, then S = (−∞, b).

Cases (iii) and (iv) are left as exercises. Q.E.D.


Nested Intervals 


We say that a sequence of intervals I_n, n ∈ N, is nested if the following chain of inclusions
holds (see Figure 2.5.1):


I_1 ⊇ I_2 ⊇ ··· ⊇ I_n ⊇ I_{n+1} ⊇ ···


For example, if I_n := [0, 1/n] for n ∈ N, then I_n ⊇ I_{n+1} for each n ∈ N so that this
sequence of intervals is nested. In this case, the element 0 belongs to all I_n and the
Archimedean Property 2.4.3 can be used to show that 0 is the only such common point.
(Prove this.) We denote this by writing ∩_{n=1}^∞ I_n = {0}.

It is important to realize that, in general, a nested sequence of intervals need not have a
common point. For example, if J_n := (0, 1/n) for n ∈ N, then this sequence of intervals is
nested, but there is no common point, since for every given x > 0, there exists (why?)
m ∈ N such that 1/m < x so that x ∉ J_m. Similarly, the sequence of intervals
K_n := (n, ∞), n ∈ N, is nested but has no common point. (Why?)

Figure 2.5.1 Nested intervals

However, it is an important property of R that every nested sequence of closed, 
bounded intervals does have a common point, as we will now prove. Notice that the 
completeness of R plays an essential role in establishing this property. 


2.5.2 Nested Intervals Property If I_n = [a_n, b_n], n ∈ N, is a nested sequence of closed
bounded intervals, then there exists a number ξ ∈ R such that ξ ∈ I_n for all n ∈ N.


Proof. Since the intervals are nested, we have I_n ⊆ I_1 for all n ∈ N, so that a_n ≤ b_1 for all
n ∈ N. Hence, the nonempty set {a_n : n ∈ N} is bounded above, and we let ξ be its
supremum. Clearly a_n ≤ ξ for all n ∈ N.

We claim also that ξ ≤ b_n for all n. This is established by showing that for any
particular n, the number b_n is an upper bound for the set {a_k : k ∈ N}. We consider two
cases. (i) If n ≤ k, then since I_n ⊇ I_k, we have a_k ≤ b_k ≤ b_n. (ii) If k < n, then since
I_k ⊇ I_n, we have a_k ≤ a_n ≤ b_n. (See Figure 2.5.2.) Thus, we conclude that a_k ≤ b_n for all
k, so that b_n is an upper bound of the set {a_k : k ∈ N}. Hence, ξ ≤ b_n for each n ∈ N. Since
a_n ≤ ξ ≤ b_n for all n, we have ξ ∈ I_n for all n ∈ N. Q.E.D.


Figure 2.5.2 If k < n, then I_n ⊆ I_k (number line marking the points a_k, a_n, b_n, b_k in that order)


2.5.3 Theorem If I_n := [a_n, b_n], n ∈ N, is a nested sequence of closed, bounded intervals
such that the lengths b_n − a_n of I_n satisfy


inf{b_n − a_n : n ∈ N} = 0,

then the number ξ contained in I_n for all n ∈ N is unique.

Proof. If η := inf{b_n : n ∈ N}, then an argument similar to the proof of 2.5.2 can be used
to show that a_n ≤ η for all n, and hence that ξ ≤ η. In fact, it is an exercise (see Exercise 10) to
show that x ∈ I_n for all n ∈ N if and only if ξ ≤ x ≤ η. If we have inf{b_n − a_n : n ∈ N} = 0,
then for any ε > 0, there exists an m ∈ N such that 0 ≤ η − ξ ≤ b_m − a_m < ε. Since this
holds for all ε > 0, it follows from Theorem 2.1.9 that η − ξ = 0. Therefore, we conclude
that ξ = η is the only point that belongs to I_n for every n ∈ N. Q.E.D.
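
A concrete nested sequence whose lengths shrink to 0 can be produced by bisection. The sketch below builds closed intervals [a_n, b_n] with rational endpoints satisfying a_n² < 2 < b_n² and b_n − a_n = 1/2^n; by Theorems 2.5.2 and 2.5.3 exactly one real number lies in all of them, and by Theorem 2.4.7 that number is √2. The starting interval [1, 2] and the function name `sqrt2_intervals` are assumptions made for this illustration.

```python
from fractions import Fraction

def sqrt2_intervals(count):
    """Nested closed intervals [a_n, b_n] with a_n^2 < 2 < b_n^2 and b_n - a_n = 1/2^n."""
    a, b = Fraction(1), Fraction(2)
    out = []
    for _ in range(count):
        mid = (a + b) / 2
        # mid is rational, so mid*mid == 2 never occurs; keep the half that brackets 2.
        if mid * mid < 2:
            a = mid
        else:
            b = mid
        out.append((a, b))
    return out

ivs = sqrt2_intervals(10)
assert all(a * a < 2 < b * b for a, b in ivs)
assert all(ivs[k][0] <= ivs[k + 1][0] and ivs[k + 1][1] <= ivs[k][1] for k in range(9))
```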


The Uncountability of R 


The concept of a countable set was discussed in Section 1.3 and the countability of the set 
Q of rational numbers was established there. We will now use the Nested Interval Property 
to prove that the set R is an uncountable set. The proof was given by Georg Cantor in 1874 
in the first of his papers on infinite sets. He later published a proof that used decimal 
representations of real numbers, and that proof will be given later in this section. 


2.5.4 Theorem The set R of real numbers is not countable. 


Proof. We will prove that the unit interval I := [0, 1] is an uncountable set. This implies
that the set R is an uncountable set, for if R were countable, then the subset I would also be
countable. (See Theorem 1.3.9(a).)

The proof is by contradiction. If we assume that I is countable, then we can enumerate
the set as I = {x_1, x_2, ..., x_n, ...}. We first select a closed subinterval I_1 of I such that
x_1 ∉ I_1, then select a closed subinterval I_2 of I_1 such that x_2 ∉ I_2, and so on. In this way, we
obtain nonempty closed intervals


I_1 ⊇ I_2 ⊇ ··· ⊇ I_n ⊇ ···


such that I_n ⊆ I and x_n ∉ I_n for all n. The Nested Intervals Property 2.5.2 implies that there
exists a point ξ ∈ I such that ξ ∈ I_n for all n. Therefore ξ ≠ x_n for all n ∈ N, so the
enumeration of I is not a complete listing of the elements of I, as claimed. Hence, I is an
uncountable set. Q.E.D.
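
The selection step in the proof can be made explicit for a finite list of points. The proof does not say how the subintervals are chosen; the sketch below assumes one possible rule (keep the left third of the current interval unless the point lies in it, otherwise keep the right third), and the function name `avoid` is hypothetical.

```python
from fractions import Fraction

def avoid(points):
    """Return nested closed intervals in [0, 1], the k-th of which omits the k-th point."""
    a, b = Fraction(0), Fraction(1)
    intervals = []
    for x in points:
        third = (b - a) / 3
        left, right = (a, a + third), (b - third, b)
        # The outer thirds are disjoint, so x lies in at most one of them.
        a, b = right if left[0] <= x <= left[1] else left
        intervals.append((a, b))
    return intervals

pts = [Fraction(0), Fraction(1, 2), Fraction(1), Fraction(2, 3)]
ivs = avoid(pts)
assert all(not (a <= x <= b) for x, (a, b) in zip(pts, ivs))
```

Any point common to all the intervals differs from every listed point, which is the contradiction the proof draws from the Nested Intervals Property.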


The fact that the set R of real numbers is uncountable can be combined with the fact that 
the set Q of rational numbers is countable to conclude that the set R\Q of irrational numbers 
is uncountable. Indeed, since the union of two countable sets is countable (see 1.3.7(c)), if 
R\Q is countable, then since R = Q U (R\Q), we conclude that R is also a countable set, 
which is a contradiction. Therefore, the set of irrational numbers R\Q is an uncountable set. 


Note: The set of real numbers can also be divided into two subsets of numbers called 
algebraic numbers and transcendental numbers. A real number is called algebraic if it is a 
solution of a polynomial equation P(x) = 0 where all the coefficients of the polynomial P 
are integers. A real number is called transcendental if it is not an algebraic number. It can 
be proved that the set of algebraic numbers is countably infinite, and consequently the set of 
transcendental numbers is uncountable. The numbers π and e are transcendental numbers,
but the proofs of these facts are very deep. For an introduction to these topics, we refer the 
interested reader to the book by Ivan Niven listed in the References. 


Binary Representations 


We will digress briefly to discuss informally the binary (and decimal) representations of real 
numbers. It will suffice to consider real numbers between 0 and 1, since the representations 
for other real numbers can then be obtained by adding a positive or negative number. 


The remainder of this section can be omitted on a first reading.
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If x € (0, 1], we will use a repeated bisection procedure to associate a sequence (a,,) of 
Os and 1s as follows. If x 4 5 belongs to the left subinterval [0, 1] we take a, := 0, while if 
x belongs to the right subinterval (5, 1] we take a, = 1. If x = 5s then we may take a; to 
be either 0 or 1. In any case, we have 


We now bisect the interval $[\tfrac12 a_1, \tfrac12 (a_1 + 1)]$. If $x$ is not the bisection point and belongs to the
left subinterval we take $a_2 := 0$, and if $x$ belongs to the right subinterval we take $a_2 := 1$. If
$x = \tfrac14$ or $x = \tfrac34$, we can take $a_2$ to be either 0 or 1. In any case, we have

$$\frac{a_1}{2} + \frac{a_2}{2^2} \le x \le \frac{a_1}{2} + \frac{a_2 + 1}{2^2}.$$


We continue this bisection procedure, assigning at the $n$th stage the value $a_n := 0$ if $x$ is
not the bisection point and lies in the left subinterval, and assigning the value $a_n := 1$ if $x$
lies in the right subinterval. In this way we obtain a sequence $(a_n)$ of 0s and 1s that
corresponds to a nested sequence of intervals containing the point $x$. For each $n$, we have
the inequality

(2)    $$\frac{a_1}{2} + \frac{a_2}{2^2} + \cdots + \frac{a_n}{2^n} \le x \le \frac{a_1}{2} + \frac{a_2}{2^2} + \cdots + \frac{a_n + 1}{2^n}.$$

If $x$ is the bisection point at the $n$th stage, then $x = m/2^n$ with $m$ odd. In this case, we may
choose either the left or the right subinterval; however, once this subinterval is chosen,
then all subsequent subintervals in the bisection procedure are determined. [For instance, if
we choose the left subinterval so that $a_n = 0$, then $x$ is the right endpoint of all subsequent
subintervals, and hence $a_k = 1$ for all $k \ge n + 1$. On the other hand, if we choose the right
subinterval so that $a_n = 1$, then $x$ is the left endpoint of all subsequent subintervals, and
hence $a_k = 0$ for all $k \ge n + 1$. For example, if $x = \tfrac34$, then the two possible sequences for
$x$ are $1, 0, 1, 1, 1, \ldots$ and $1, 1, 0, 0, 0, \ldots$.]

To summarize: If $x \in [0,1]$, then there exists a sequence $(a_n)$ of 0s and 1s such that
inequality (2) holds for all $n \in \mathbb{N}$. In this case we write

(3)    $$x = (.a_1 a_2 \cdots a_n \cdots)_2,$$

and call (3) a binary representation of $x$. This representation is unique except when
$x = m/2^n$ for $m$ odd, in which case $x$ has the two representations

$$x = (.a_1 a_2 \cdots a_{n-1} 1000 \cdots)_2 = (.a_1 a_2 \cdots a_{n-1} 0111 \cdots)_2,$$

one ending in 0s and the other ending in 1s.

Conversely, each sequence of 0s and 1s is the binary representation of a unique real
number in $[0,1]$. The inequality corresponding to (2) determines a closed interval with
length $1/2^n$ and the sequence of these intervals is nested. Therefore, Theorem 2.5.3 implies
that there exists a unique real number $x$ satisfying (2) for every $n \in \mathbb{N}$. Consequently, $x$ has
the binary representation $(.a_1 a_2 \cdots a_n \cdots)_2$.
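The bisection procedure described above can be sketched in a few lines of Python. This is only an illustration: the function name `binary_digits` and the flag `prefer_left` (which settles the choice when $x$ is a bisection point) are our own assumptions, and exact rational arithmetic is used so that no rounding interferes.

```python
from fractions import Fraction


def binary_digits(x, n, prefer_left=True):
    """Return the first n binary digits [a_1, ..., a_n] of x in [0, 1].

    At each stage the current interval [lo, hi] is bisected; the digit is 0
    for the left half and 1 for the right half. If x is the bisection point,
    prefer_left decides which half is chosen.
    """
    x = Fraction(x)
    if not 0 <= x <= 1:
        raise ValueError("x must lie in [0, 1]")
    lo, hi = Fraction(0), Fraction(1)
    digits = []
    for _ in range(n):
        mid = (lo + hi) / 2
        if x < mid or (x == mid and prefer_left):
            digits.append(0)
            hi = mid
        else:
            digits.append(1)
            lo = mid
    return digits


# The two representations of 3/4 discussed in the text:
print(binary_digits(Fraction(3, 4), 5, prefer_left=True))   # [1, 0, 1, 1, 1]
print(binary_digits(Fraction(3, 4), 5, prefer_left=False))  # [1, 1, 0, 0, 0]
```

For any output of this sketch, the partial sum $a_1/2 + \cdots + a_n/2^n$ and that sum plus $1/2^n$ bracket $x$, which is exactly inequality (2).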


Remark The concept of binary representation is extremely important in this era of digital
computers. A number is entered in a digital computer on “bits,” and each bit can be put in
one of two states: either it will pass current or it will not. These two states correspond to
the values 1 and 0, respectively. Thus, the binary representation of a number can be stored
in a digital computer on a string of bits. Of course, in actual practice, since only finitely
many bits can be stored, the binary representations must be truncated. If $n$ binary digits are
used for a number $x \in [0,1]$, then the accuracy is at most $1/2^n$. For example, to assure
four-decimal accuracy, it is necessary to use at least 15 binary digits (or 15 bits).
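The count of 15 bits can be checked with a short computation. The sketch below rests on an assumption of ours: that “four-decimal accuracy” means an error below half a unit in the fourth decimal place, that is, below $\tfrac12 \cdot 10^{-4}$. The helper name `bits_needed` is hypothetical.

```python
from fractions import Fraction


def bits_needed(tolerance):
    """Smallest n such that 1/2**n < tolerance."""
    tolerance = Fraction(tolerance)
    n = 0
    while Fraction(1, 2 ** n) >= tolerance:
        n += 1
    return n


# Assumed reading of "four-decimal accuracy": error below (1/2) * 10**(-4).
half_unit_in_fourth_place = Fraction(1, 2) * Fraction(1, 10 ** 4)
print(bits_needed(half_unit_in_fourth_place))  # 15
```

Under the weaker reading that the error need only be below $10^{-4}$, the same function returns 14, so the stated figure depends on the convention adopted.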


Decimal Representations 


Decimal representations of real numbers are similar to binary representations, except that 
we subdivide intervals into ten equal subintervals instead of two. 

Thus, given $x \in [0,1]$, if we subdivide $[0,1]$ into ten equal subintervals, then $x$ belongs
to a subinterval $[b_1/10, (b_1 + 1)/10]$ for some integer $b_1$ in $\{0, 1, \ldots, 9\}$. Proceeding as in
the binary case, we obtain a sequence $(b_n)$ of integers with $0 \le b_n \le 9$ for all $n \in \mathbb{N}$ such
that $x$ satisfies

(4)    $$\frac{b_1}{10} + \frac{b_2}{10^2} + \cdots + \frac{b_n}{10^n} \le x \le \frac{b_1}{10} + \frac{b_2}{10^2} + \cdots + \frac{b_n + 1}{10^n}.$$

In this case we say that $x$ has a decimal representation given by

$$x = .b_1 b_2 \cdots b_n \cdots.$$


If $x \ge 1$ and if $B \in \mathbb{N}$ is such that $B \le x < B + 1$, then $x = B.b_1 b_2 \cdots b_n \cdots$ where the
decimal representation of $x - B \in [0,1]$ is as above. Negative numbers are treated
similarly.

The fact that each decimal determines a unique real number follows from Theorem
2.5.3, since each decimal specifies a nested sequence of intervals with lengths $1/10^n$.

The decimal representation of $x \in [0,1]$ is unique except when $x$ is a subdivision point at
some stage, which can be seen to occur when $x = m/10^n$ for some $m, n \in \mathbb{N}$, $1 \le m \le 10^n$.
(We may also assume that $m$ is not divisible by 10.) When $x$ is a subdivision point at the $n$th
stage, one choice for $b_n$ corresponds to selecting the left subinterval, which causes all
subsequent digits to be 9, and the other choice corresponds to selecting the right subinterval,
which causes all subsequent digits to be 0. [For example, if $x = \tfrac12$, then
$x = .4999\cdots = .5000\cdots$, and if $y = 38/100$ then $y = .37999\cdots = .38000\cdots$.]


Periodic Decimals 


A decimal $B.b_1 b_2 \cdots b_n \cdots$ is said to be periodic (or to be repeating), if there exist $k, m \in \mathbb{N}$
such that $b_n = b_{n+m}$ for all $n \ge k$. In this case, the block of digits $b_k b_{k+1} \cdots b_{k+m-1}$ is
repeated once the $k$th digit is reached. The smallest number $m$ with this property is called the
period of the decimal. For example, $19/88 = .2159090\cdots90\cdots$ has period $m = 2$ with
repeating block 90 starting at $k = 4$. A terminating decimal is a periodic decimal where the
repeated block is simply the digit 0.

We will give an informal proof of the assertion: A positive real number is rational if 
and only if its decimal representation is periodic. 

Suppose that $x = p/q$ where $p, q \in \mathbb{N}$ have no common integer factors other than 1. For
convenience we will also suppose that $0 < p < q$. We note that the process of “long
division” of $q$ into $p$ gives the decimal representation of $p/q$. Each step in the division
process produces a remainder that is an integer from 0 to $q - 1$. Therefore, after at most $q$
steps, some remainder will occur a second time and, at that point, the digits in the quotient
will begin to repeat themselves in cycles. Hence, the decimal representation of such a
rational number is periodic.
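The remainder argument can be followed step by step in code. The sketch below performs long division and records the position at which each remainder first appears; as soon as a remainder repeats, the repeating block is known. The function name `decimal_expansion` and its return convention are our own choices for illustration.

```python
def decimal_expansion(p, q):
    """Long division of q into p, assuming 0 < p < q.

    Returns (prefix, block): the digits before the repetition begins and the
    repeating block. A terminating decimal has block "0".
    """
    if not 0 < p < q:
        raise ValueError("expected 0 < p < q")
    digits = []
    first_seen = {}  # remainder -> index of the digit it produces
    r = p
    while r not in first_seen:
        first_seen[r] = len(digits)
        r *= 10
        digits.append(str(r // q))
        r %= q
    start = first_seen[r]
    return "".join(digits[:start]), "".join(digits[start:])


print(decimal_expansion(19, 88))  # ('215', '90')
print(decimal_expansion(1, 2))    # ('5', '0')
```

Since every remainder lies in $\{0, 1, \ldots, q - 1\}$, the loop runs at most $q$ times, in agreement with the counting argument above.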

Conversely, if a decimal is periodic, then it represents a rational number. The idea of
the proof is best illustrated by an example. Suppose that $x = 7.31414\cdots14\cdots$. We
multiply by a power of 10 to move the decimal point to the first repeating block; here
obtaining $10x = 73.1414\cdots$. We now multiply by a power of 10 to move one block to the
left of the decimal point; here getting $1000x = 7314.1414\cdots$. We now subtract to obtain
an integer; here getting $1000x - 10x = 7314 - 73 = 7241$, whence $x = 7241/990$, a
rational number.
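The same two multiplications and one subtraction work for any periodic decimal with a nonnegative integer part. The following sketch mirrors the example; the function name `periodic_to_fraction` and the way the decimal is passed in (integer part, non-repeating digits, repeating block) are assumptions made for illustration.

```python
from fractions import Fraction


def periodic_to_fraction(integer_part, prefix, block):
    """Rational number with decimal form integer_part.prefix block block ...

    prefix holds the digits before the repetition starts (possibly empty),
    block holds the repeating digits; integer_part is assumed nonnegative.
    """
    j, m = len(prefix), len(block)
    # 10**j * x has the decimal point just before the first repeating block.
    small = int(str(integer_part) + prefix)
    # 10**(j + m) * x has one full block moved to the left of the point.
    large = int(str(integer_part) + prefix + block)
    # The fractional parts agree, so the difference is an integer.
    return Fraction(large - small, 10 ** (j + m) - 10 ** j)


print(periodic_to_fraction(7, "3", "14"))    # 7241/990
print(periodic_to_fraction(0, "215", "90"))  # 19/88
```

Note that `Fraction` reduces the result to lowest terms automatically, so the second call returns $19/88$ rather than $21375/99000$.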


Cantor’s Second Proof 


We will now give Cantor’s second proof of the uncountability of R. This is the elegant 
“diagonal” argument based on decimal representations of real numbers. 


2.5.5 Theorem The unit interval $[0,1] := \{x \in \mathbb{R} : 0 \le x \le 1\}$ is not countable.


Proof. The proof is by contradiction. We will use the fact that every real number $x \in [0,1]$
has a decimal representation $x = 0.b_1 b_2 b_3 \cdots$, where $b_i = 0, 1, \ldots, 9$. Suppose that there is
an enumeration $x_1, x_2, x_3, \ldots$ of all numbers in $[0,1]$, which we display as:

$$\begin{aligned}
x_1 &= 0.b_{11} b_{12} b_{13} \cdots b_{1n} \cdots, \\
x_2 &= 0.b_{21} b_{22} b_{23} \cdots b_{2n} \cdots, \\
x_3 &= 0.b_{31} b_{32} b_{33} \cdots b_{3n} \cdots, \\
&\ \ \vdots \\
x_n &= 0.b_{n1} b_{n2} b_{n3} \cdots b_{nn} \cdots, \\
&\ \ \vdots
\end{aligned}$$

We now define a real number $y := 0.y_1 y_2 y_3 \cdots y_n \cdots$ by setting $y_1 := 2$ if $b_{11} \ge 5$ and
$y_1 := 7$ if $b_{11} \le 4$; in general, we let

$$y_n := \begin{cases} 2 & \text{if } b_{nn} \ge 5, \\ 7 & \text{if } b_{nn} \le 4. \end{cases}$$

Then $y \in [0,1]$. Note that the number $y$ is not equal to any of the numbers with two decimal
representations, since $y_n \neq 0, 9$ for all $n \in \mathbb{N}$. Further, since $y$ and $x_n$ differ in the $n$th
decimal place, then $y \neq x_n$ for any $n \in \mathbb{N}$. Therefore, $y$ is not included in the enumeration of
$[0,1]$, contradicting the hypothesis. Q.E.D.
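The diagonal rule can be tried on a finite portion of a purported enumeration. The sketch below is an illustration only: a program can inspect finitely many digits, so it shows the construction and not the proof. The function name `diagonal_digits` and the sample rows are our own.

```python
def diagonal_digits(rows):
    """First len(rows) digits of y, where rows[n-1] lists the digits of x_n.

    The nth digit of y is 2 if the nth digit of x_n is at least 5,
    and 7 if it is at most 4.
    """
    return "".join("2" if int(row[n]) >= 5 else "7" for n, row in enumerate(rows))


# Hypothetical first four digits of x_1, x_2, x_3, x_4.
rows = ["5000", "1234", "9999", "0004"]
y = diagonal_digits(rows)
print(y)  # 2727
# y differs from x_n in the nth decimal place for every listed n.
print(all(y[n] != rows[n][n] for n in range(len(rows))))  # True
```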


Exercises for Section 2.5 


1. If $I := [a, b]$ and $I' := [a', b']$ are closed intervals in $\mathbb{R}$, show that $I \subseteq I'$ if and only if $a' \le a$ and
$b \le b'$.

2. If $S \subseteq \mathbb{R}$ is nonempty, show that $S$ is bounded if and only if there exists a closed bounded interval
$I$ such that $S \subseteq I$.

3. If $S \subseteq \mathbb{R}$ is a nonempty bounded set, and $I_S := [\inf S, \sup S]$, show that $S \subseteq I_S$. Moreover, if $J$ is
any closed bounded interval containing $S$, show that $I_S \subseteq J$.

4. In the proof of Case (ii) of Theorem 2.5.1, explain why $x, y$ exist in $S$.

5. Write out the details of the proof of Case (iv) in Theorem 2.5.1.

6. If $I_1 \supseteq I_2 \supseteq \cdots \supseteq I_n \supseteq \cdots$ is a nested sequence of intervals and if $I_n = [a_n, b_n]$, show that
$a_1 \le a_2 \le \cdots \le a_n \le \cdots$ and $b_1 \ge b_2 \ge \cdots \ge b_n \ge \cdots$.

7. Let $I_n := [0, 1/n]$ for $n \in \mathbb{N}$. Prove that $\bigcap_{n=1}^{\infty} I_n = \{0\}$.

8. Let $J_n := (0, 1/n)$ for $n \in \mathbb{N}$. Prove that $\bigcap_{n=1}^{\infty} J_n = \emptyset$.

9. Let $K_n := (n, \infty)$ for $n \in \mathbb{N}$. Prove that $\bigcap_{n=1}^{\infty} K_n = \emptyset$.


10. With the notation in the proofs of Theorems 2.5.2 and 2.5.3, show that we have $\eta \in \bigcap_{n=1}^{\infty} I_n$.
Also show that $[\xi, \eta] = \bigcap_{n=1}^{\infty} I_n$.

11. Show that the intervals obtained from the inequalities in (2) form a nested sequence.

12. Give the two binary representations of $\tfrac38$ and $\tfrac{7}{16}$.

13. (a) Give the first four digits in the binary representation of $\tfrac13$.
(b) Give the complete binary representation of $\tfrac13$.

14. Show that if $a_k, b_k \in \{0, 1, \ldots, 9\}$ and if

$$\frac{a_1}{10} + \frac{a_2}{10^2} + \cdots + \frac{a_n}{10^n} = \frac{b_1}{10} + \frac{b_2}{10^2} + \cdots + \frac{b_m}{10^m} \neq 0,$$

then $n = m$ and $a_k = b_k$ for $k = 1, \ldots, n$.

15. Find the decimal representation of $-\tfrac27$.

16. Express $\tfrac17$ and $\tfrac{2}{19}$ as periodic decimals.

17. What rationals are represented by the periodic decimals $1.25137\cdots137\cdots$ and
$35.14653\cdots653\cdots$?


CHAPTER 3 


SEQUENCES AND SERIES 


Now that the foundations of the real number system R have been laid, we are prepared to 
pursue questions of a more analytic nature, and we will begin with a study of the 
convergence of sequences. Some of the early results may be familiar to the reader 
from calculus, but the presentation here is intended to be rigorous and will lead to certain 
more profound theorems than are usually discussed in earlier courses. 

We will first introduce the meaning of the convergence of a sequence of real numbers 
and establish some basic, but useful, results about convergent sequences. We then present 
some deeper results concerning the convergence of sequences. These include the 
Monotone Convergence Theorem, the Bolzano-Weierstrass Theorem, and the Cauchy 
Criterion for convergence of sequences. It is important for the reader to learn both the 
theorems and how the theorems apply to special sequences. 

Because of the linear limitations inherent in a book it is necessary to decide where to 
locate the subject of infinite series. It would be reasonable to follow this chapter with a full 
discussion of infinite series, but this would delay the important topics of continuity, 
differentiation, and integration. Consequently, we have decided to compromise. A brief 
introduction to infinite series is given in Section 3.7 at the end of this chapter, and a more 
extensive treatment is given later in Chapter 9. Thus readers who want a fuller discussion of 
series at this point can move to Chapter 9 after completing this chapter. 


Augustin-Louis Cauchy 

Augustin-Louis Cauchy (1789-1857) was born in Paris just after the start of 
the French Revolution. His father was a lawyer in the Paris police 
department, and the family was forced to flee during the Reign of Terror. 
As a result, Cauchy’s early years were difficult and he developed strong 
anti-revolutionary and pro-royalist feelings. After returning to Paris, Cau- 
chy’s father became secretary to the newly-formed Senate, which included 
the mathematicians Laplace and Lagrange. They were impressed by young 
Cauchy’s mathematical talent and helped him begin his career. 

He entered the Ecole Polytechnique in 1805 and soon established 
a reputation as an exceptional mathematician. In 1815, the year royalty was restored, he was 
appointed to the faculty of the Ecole Polytechnique, but his strong political views and his 
uncompromising standards in mathematics often resulted in bad relations with his colleagues. 
After the July revolution of 1830, Cauchy refused to sign the new loyalty oath and left France for 
eight years in self-imposed exile. In 1838, he accepted a minor teaching post in Paris, and in 1848 
Napoleon III reinstated him to his former position at the Ecole Polytechnique, where he remained 
until his death. 

Cauchy was amazingly versatile and prolific, making substantial contributions to many 
areas, including real and complex analysis, number theory, differential equations, mathematical 
physics and probability. He published eight books and 789 papers, and his collected works fill 
26 volumes. He was one of the most important mathematicians in the first half of the nineteenth 
century.


Section 3.1 Sequences and Their Limits


A sequence in a set S is a function whose domain is the set N of natural numbers, and whose 
range is contained in the set S. In this chapter, we will be concerned with sequences in R 
and will discuss what we mean by the convergence of these sequences. 


3.1.1 Definition A sequence of real numbers (or a sequence in R) is a function defined 
on the set N = {1,2,...} of natural numbers whose range is contained in the set R of real 
numbers. 


In other words, a sequence in $\mathbb{R}$ assigns to each natural number $n = 1, 2, \ldots$ a
uniquely determined real number. If $X : \mathbb{N} \to \mathbb{R}$ is a sequence, we will usually denote the
value of $X$ at $n$ by the symbol $x_n$ rather than using the function notation $X(n)$. The values $x_n$
are also called the terms or the elements of the sequence. We will denote this sequence by
the notations

$$X, \qquad (x_n), \qquad (x_n : n \in \mathbb{N}).$$

Of course, we will often use other letters, such as $Y = (y_k)$, $Z = (z_i)$, and so on, to denote
sequences.

We purposely use parentheses to emphasize that the ordering induced by the natural
order of $\mathbb{N}$ is a matter of importance. Thus, we distinguish notationally between the
sequence $(x_n : n \in \mathbb{N})$, whose infinitely many terms have an ordering, and the set of values
$\{x_n : n \in \mathbb{N}\}$ in the range of the sequence that are not ordered. For example, the sequence
$X := ((-1)^n : n \in \mathbb{N})$ has infinitely many terms that alternate between $-1$ and $1$, whereas
the set of values $\{(-1)^n : n \in \mathbb{N}\}$ is equal to the set $\{-1, 1\}$, which has only two elements.

Sequences are often defined by giving a formula for the $n$th term $x_n$. Frequently, it is
convenient to list the terms of a sequence in order, stopping when the rule of formation
seems evident. For example, we may define the sequence of reciprocals of the even
numbers by writing

$$X := \left( \frac12, \frac14, \frac16, \frac18, \ldots \right),$$

though a more satisfactory method is to specify the formula for the general term and write

$$X := \left( \frac{1}{2n} : n \in \mathbb{N} \right)$$

or more simply $X = (1/2n)$.

Another way of defining a sequence is to specify the value of $x_1$ and give a formula for
$x_{n+1}$ $(n \ge 1)$ in terms of $x_n$. More generally, we may specify $x_1$ and give a formula for
obtaining $x_{n+1}$ from $x_1, x_2, \ldots, x_n$. Sequences defined in this manner are said to be
inductively (or recursively) defined.


3.1.2 Examples (a) If $b \in \mathbb{R}$, the sequence $B := (b, b, b, \ldots)$, all of whose terms equal
$b$, is called the constant sequence $b$. Thus the constant sequence 1 is the sequence
$(1, 1, 1, \ldots)$, and the constant sequence 0 is the sequence $(0, 0, 0, \ldots)$.

(b) If $b \in \mathbb{R}$, then $B := (b^n)$ is the sequence $B = (b, b^2, b^3, \ldots, b^n, \ldots)$. In particular, if
$b = \tfrac12$, then we obtain the sequence

$$\left( \frac{1}{2^n} : n \in \mathbb{N} \right) = \left( \frac12, \frac14, \frac18, \ldots, \frac{1}{2^n}, \ldots \right).$$

(c) The sequence $(2n : n \in \mathbb{N})$ of even natural numbers can be defined inductively by

$$x_1 := 2, \qquad x_{n+1} := x_n + 2,$$

or by the definition

$$y_1 := 2, \qquad y_{n+1} := y_1 + y_n.$$

(d) The celebrated Fibonacci sequence $F := (f_n)$ is given by the inductive definition

$$f_1 := 1, \qquad f_2 := 1, \qquad f_{n+1} := f_{n-1} + f_n \quad (n \ge 2).$$

Thus each term past the second is the sum of its two immediate predecessors. The first ten
terms of $F$ are seen to be $(1, 1, 2, 3, 5, 8, 13, 21, 34, 55, \ldots)$.
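Inductive definitions translate naturally into generators that produce the terms one at a time. The sketch below renders Examples (c) and (d) in this way; the function names `even_numbers` and `fibonacci` are our own, and only finitely many terms can of course be listed.

```python
from itertools import islice


def even_numbers():
    """x_1 := 2, x_{n+1} := x_n + 2."""
    x = 2
    while True:
        yield x
        x = x + 2


def fibonacci():
    """f_1 := 1, f_2 := 1, f_{n+1} := f_{n-1} + f_n for n >= 2."""
    prev, curr = 1, 1
    while True:
        yield prev
        prev, curr = curr, prev + curr


print(list(islice(even_numbers(), 5)))  # [2, 4, 6, 8, 10]
print(list(islice(fibonacci(), 10)))    # [1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
```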


The Limit of a Sequence 


There are a number of different limit concepts in real analysis. The notion of limit of a 
sequence is the most basic, and it will be the focus of this chapter. 


3.1.3 Definition A sequence $X = (x_n)$ in $\mathbb{R}$ is said to converge to $x \in \mathbb{R}$, or $x$ is said to
be a limit of $(x_n)$, if for every $\varepsilon > 0$ there exists a natural number $K(\varepsilon)$ such that for all
$n \ge K(\varepsilon)$, the terms $x_n$ satisfy $|x_n - x| < \varepsilon$.

If a sequence has a limit, we say that the sequence is convergent; if it has no limit, we
say that the sequence is divergent.


Note The notation $K(\varepsilon)$ is used to emphasize that the choice of $K$ depends on the value
of $\varepsilon$. However, it is often convenient to write $K$ instead of $K(\varepsilon)$. In most cases, a “small”
value of $\varepsilon$ will require a “large” value of $K$ to guarantee that the distance $|x_n - x|$
between $x_n$ and $x$ is less than $\varepsilon$ for all $n \ge K = K(\varepsilon)$.


When a sequence has limit $x$, we will use the notation

$$\lim X = x \qquad \text{or} \qquad \lim(x_n) = x.$$

We will sometimes use the symbolism $x_n \to x$, which indicates the intuitive idea that the
values $x_n$ “approach” the number $x$ as $n \to \infty$.


3.1.4 Uniqueness of Limits A sequence in $\mathbb{R}$ can have at most one limit.


Proof. Suppose that $x'$ and $x''$ are both limits of $(x_n)$. For each $\varepsilon > 0$ there exists $K'$ such
that $|x_n - x'| < \varepsilon/2$ for all $n \ge K'$, and there exists $K''$ such that $|x_n - x''| < \varepsilon/2$ for all
$n \ge K''$. We let $K$ be the larger of $K'$ and $K''$. Then for $n \ge K$ we apply the Triangle
Inequality to get

$$|x' - x''| = |x' - x_n + x_n - x''| \le |x' - x_n| + |x_n - x''| < \varepsilon/2 + \varepsilon/2 = \varepsilon.$$

Since $\varepsilon > 0$ is an arbitrary positive number, we conclude that $x' - x'' = 0$. Q.E.D.


For $x \in \mathbb{R}$ and $\varepsilon > 0$, recall that the $\varepsilon$-neighborhood of $x$ is the set

$$V_\varepsilon(x) := \{u \in \mathbb{R} : |u - x| < \varepsilon\}.$$

(See Section 2.2.) Since $u \in V_\varepsilon(x)$ is equivalent to $|u - x| < \varepsilon$, the definition of convergence
of a sequence can be formulated in terms of neighborhoods. We give several different
ways of saying that a sequence $x_n$ converges to $x$ in the following theorem.


3.1.5 Theorem Let $X = (x_n)$ be a sequence of real numbers, and let $x \in \mathbb{R}$. The
following statements are equivalent.


(a) $X$ converges to $x$.

(b) For every $\varepsilon > 0$, there exists a natural number $K$ such that for all $n \ge K$, the terms $x_n$
satisfy $|x_n - x| < \varepsilon$.

(c) For every $\varepsilon > 0$, there exists a natural number $K$ such that for all $n \ge K$, the terms $x_n$
satisfy $x - \varepsilon < x_n < x + \varepsilon$.

(d) For every $\varepsilon$-neighborhood $V_\varepsilon(x)$ of $x$, there exists a natural number $K$ such that for
all $n \ge K$, the terms $x_n$ belong to $V_\varepsilon(x)$.


Proof. The equivalence of (a) and (b) is just the definition. The equivalence of (b), (c), and
(d) follows from the following implications:

$$|u - x| < \varepsilon \iff -\varepsilon < u - x < \varepsilon \iff x - \varepsilon < u < x + \varepsilon \iff u \in V_\varepsilon(x).$$

Q.E.D.


With the language of neighborhoods, one can describe the convergence of the
sequence $X = (x_n)$ to the number $x$ by saying: for each $\varepsilon$-neighborhood $V_\varepsilon(x)$ of $x$, all
but a finite number of terms of $X$ belong to $V_\varepsilon(x)$. The finite number of terms that may not
belong to the $\varepsilon$-neighborhood are the terms $x_1, x_2, \ldots, x_{K-1}$.


Remark The definition of the limit of a sequence of real numbers is used to verify that a 
proposed value x is indeed the limit. It does not provide a means for initially determining 
what that value of x might be. Later results will contribute to this end, but quite often it is 
necessary in practice to arrive at a conjectured value of the limit by direct calculation of a 
number of terms of the sequence. Computers can be helpful in this respect, but since they 
can calculate only a finite number of terms of a sequence, such computations do not in any 
way constitute a proof of the value of the limit. 


The following examples illustrate how the definition is applied to prove that a
sequence has a particular limit. In each case, a positive $\varepsilon$ is given and we are required
to find a $K$, depending on $\varepsilon$, as required by the definition.


3.1.6 Examples (a) $\lim(1/n) = 0$.

If $\varepsilon > 0$ is given, then $1/\varepsilon > 0$. By the Archimedean Property 2.4.3, there is a natural
number $K = K(\varepsilon)$ such that $1/K < \varepsilon$. Then, if $n \ge K$, we have $1/n \le 1/K < \varepsilon$. Consequently,
if $n \ge K$, then

$$\left| \frac{1}{n} - 0 \right| = \frac{1}{n} < \varepsilon.$$

Therefore, we can assert that the sequence $(1/n)$ converges to 0.

(b) $\lim(1/(n^2 + 1)) = 0$.

Let $\varepsilon > 0$ be given. To find $K$, we first note that if $n \in \mathbb{N}$, then

$$\frac{1}{n^2 + 1} < \frac{1}{n^2} \le \frac{1}{n}.$$

Now choose $K$ such that $1/K < \varepsilon$, as in (a) above. Then $n \ge K$ implies that $1/n < \varepsilon$, and
therefore

$$\left| \frac{1}{n^2 + 1} - 0 \right| = \frac{1}{n^2 + 1} < \frac{1}{n} < \varepsilon.$$

Hence, we have shown that the limit of the sequence is zero.


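(c) $\lim\left( \dfrac{3n + 2}{n + 1} \right) = 3$.

Given $\varepsilon > 0$, we want to obtain the inequality

(1)    $$\left| \frac{3n + 2}{n + 1} - 3 \right| < \varepsilon$$

when $n$ is sufficiently large. We first simplify the expression on the left:

$$\left| \frac{3n + 2}{n + 1} - 3 \right| = \left| \frac{3n + 2 - 3n - 3}{n + 1} \right| = \left| \frac{-1}{n + 1} \right| = \frac{1}{n + 1} < \frac{1}{n}.$$

Now if the inequality $1/n < \varepsilon$ is satisfied, then the inequality (1) holds. Thus if $1/K < \varepsilon$,
then for any $n \ge K$, we also have $1/n < \varepsilon$ and hence (1) holds. Therefore the limit of the
sequence is 3.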
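(d) $\lim(\sqrt{n + 1} - \sqrt{n}) = 0$.

We multiply and divide by $\sqrt{n + 1} + \sqrt{n}$ to get

$$\sqrt{n + 1} - \sqrt{n} = \frac{(\sqrt{n + 1} - \sqrt{n})(\sqrt{n + 1} + \sqrt{n})}{\sqrt{n + 1} + \sqrt{n}} = \frac{n + 1 - n}{\sqrt{n + 1} + \sqrt{n}} = \frac{1}{\sqrt{n + 1} + \sqrt{n}} < \frac{1}{\sqrt{n}}.$$

For a given $\varepsilon > 0$, we obtain $1/\sqrt{n} < \varepsilon$ if and only if $1/n < \varepsilon^2$ or $n > 1/\varepsilon^2$. Thus if we take
$K > 1/\varepsilon^2$, then $\sqrt{n + 1} - \sqrt{n} < \varepsilon$ for all $n \ge K$. (For example, if we are given $\varepsilon = 1/10$,
then $K > 100$ is required.)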
(e) If $0 < b < 1$, then $\lim(b^n) = 0$.

We will use elementary properties of the natural logarithm function. If $\varepsilon > 0$ is given,
we see that

$$b^n < \varepsilon \iff n \ln b < \ln \varepsilon \iff n > \ln \varepsilon / \ln b.$$

(The last inequality is reversed because $\ln b < 0$.) Thus if we choose $K$ to be a number such
that $K > \ln \varepsilon / \ln b$, then we will have $0 < b^n < \varepsilon$ for all $n \ge K$. Thus we have $\lim(b^n) = 0$.

For example, if $b = .8$, and if $\varepsilon = .01$ is given, then we would need
$K > \ln .01 / \ln .8 \approx 20.6377$. Thus $K = 21$ would be an appropriate choice for $\varepsilon = .01$.
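The numerical example can be reproduced directly. The sketch below computes the smallest natural number exceeding $\ln \varepsilon / \ln b$; the function name `k_for_power` is hypothetical, and since floating-point logarithms are used, the result should be read as a check of the arithmetic and not as part of the proof.

```python
import math


def k_for_power(b, eps):
    """Smallest natural number K with K > ln(eps)/ln(b), for 0 < b < 1, eps > 0."""
    if not (0 < b < 1 and eps > 0):
        raise ValueError("expected 0 < b < 1 and eps > 0")
    return max(1, math.floor(math.log(eps) / math.log(b)) + 1)


K = k_for_power(0.8, 0.01)
print(K)                           # 21
print(0.8 ** K < 0.01)             # True
print(0.8 ** (K - 1) < 0.01)       # False: n = 20 is not yet small enough
```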


Remark The $K(\varepsilon)$ Game In the notion of convergence of a sequence, one way to keep
in mind the connection between the $\varepsilon$ and the $K$ is to think of it as a game called the $K(\varepsilon)$
Game. In this game, Player A asserts that a certain number $x$ is the limit of a sequence $(x_n)$.
Player B challenges this assertion by giving Player A a specific value for $\varepsilon > 0$. Player A
must respond to the challenge by coming up with a value of $K$ such that $|x_n - x| < \varepsilon$ for all
$n \ge K$. If Player A can always find a value of $K$ that works, then he wins, and the sequence
is convergent. However, if Player B can give a specific value of $\varepsilon > 0$ for which Player A
cannot respond adequately, then Player B wins, and we conclude that the sequence does
not converge to $x$.


In order to show that a sequence $X = (x_n)$ does not converge to the number $x$, it is
enough to produce one number $\varepsilon_0 > 0$ such that no matter what natural number $K$ is chosen,
one can find a particular $n_K$ satisfying $n_K \ge K$ such that $|x_{n_K} - x| \ge \varepsilon_0$. (This will be
discussed in more detail in Section 3.4.)


3.1.7 Example The sequence $(0, 2, 0, 2, \ldots, 0, 2, \ldots)$ does not converge to the
number 0.

If Player A asserts that 0 is the limit of the sequence, he will lose the $K(\varepsilon)$ Game when
Player B gives him a value of $\varepsilon < 2$. To be definite, let Player B give Player A the value
$\varepsilon_0 = 1$. Then no matter what value Player A chooses for $K$, his response will not be
adequate, for Player B will respond by selecting an even number $n > K$. Then the
corresponding value is $x_n = 2$ so that $|x_n - 0| = 2 > 1 = \varepsilon_0$. Thus the number 0 is not
the limit of the sequence.


Tails of Sequences 


It is important to realize that the convergence (or divergence) of a sequence $X = (x_n)$
depends only on the “ultimate behavior” of the terms. By this we mean that if, for any
natural number $m$, we drop the first $m$ terms of the sequence, then the resulting sequence $X_m$
converges if and only if the original sequence converges, and in this case, the limits are the
same. We will state this formally after we introduce the idea of a “tail” of a sequence.


3.1.8 Definition If $X = (x_1, x_2, \ldots, x_n, \ldots)$ is a sequence of real numbers and if $m$ is a
given natural number, then the $m$-tail of $X$ is the sequence

$$X_m := (x_{m+n} : n \in \mathbb{N}) = (x_{m+1}, x_{m+2}, \ldots).$$

For example, the 3-tail of the sequence $X = (2, 4, 6, 8, 10, \ldots, 2n, \ldots)$ is the sequence
$X_3 = (8, 10, 12, \ldots, 2n + 6, \ldots)$.


3.1.9 Theorem Let $X = (x_n : n \in \mathbb{N})$ be a sequence of real numbers and let $m \in \mathbb{N}$.
Then the $m$-tail $X_m = (x_{m+n} : n \in \mathbb{N})$ of $X$ converges if and only if $X$ converges. In this
case, $\lim X_m = \lim X$.


Proof. We note that for any $p \in \mathbb{N}$, the $p$th term of $X_m$ is the $(p + m)$th term of $X$.
Similarly, if $q > m$, then the $q$th term of $X$ is the $(q - m)$th term of $X_m$.

Assume $X$ converges to $x$. Then given any $\varepsilon > 0$, if the terms of $X$ for $n \ge K(\varepsilon)$ satisfy
$|x_n - x| < \varepsilon$, then the terms of $X_m$ for $k \ge K(\varepsilon) - m$ satisfy $|x_k - x| < \varepsilon$. Thus we can take
$K_m(\varepsilon) = K(\varepsilon) - m$, so that $X_m$ also converges to $x$.

Conversely, if the terms of $X_m$ for $k \ge K_m(\varepsilon)$ satisfy $|x_k - x| < \varepsilon$, then the terms of $X$
for $n \ge K_m(\varepsilon) + m$ satisfy $|x_n - x| < \varepsilon$. Thus we can take $K(\varepsilon) = K_m(\varepsilon) + m$.

Therefore, $X$ converges to $x$ if and only if $X_m$ converges to $x$. Q.E.D.


We shall sometimes say that a sequence X ultimately has a certain property 
if some tail of X has this property. For example, we say that the sequence 
(3,4,5,5,5,...,5,...) is “ultimately constant.” On the other hand, the sequence 
(3,5,3,5,...,3,5,...) is not ultimately constant. The notion of convergence can be 
stated using this terminology: A sequence $X$ converges to $x$ if and only if the terms of
$X$ are ultimately in every $\varepsilon$-neighborhood of $x$. Other instances of this “ultimate
terminology” will be noted later.


Further Examples


In establishing that a number $x$ is the limit of a sequence $(x_n)$, we often try to simplify
the difference $|x_n - x|$ before considering an $\varepsilon > 0$ and finding a $K(\varepsilon)$ as required by the
definition of limit. This was done in some of the earlier examples. The next result is a
more formal statement of this idea, and the examples that follow make use of this
approach.


3.1.10 Theorem Let $(x_n)$ be a sequence of real numbers and let $x \in \mathbb{R}$. If $(a_n)$ is a
sequence of positive real numbers with $\lim(a_n) = 0$ and if for some constant $C > 0$ and
some $m \in \mathbb{N}$ we have

$$|x_n - x| \le C a_n \quad \text{for all } n \ge m,$$

then it follows that $\lim(x_n) = x$.

Proof. If $\varepsilon > 0$ is given, then since $\lim(a_n) = 0$, we know there exists $K = K(\varepsilon/C)$ such
that $n \ge K$ implies

$$a_n = |a_n - 0| < \varepsilon/C.$$

Therefore it follows that if both $n \ge K$ and $n \ge m$, then

$$|x_n - x| \le C a_n < C(\varepsilon/C) = \varepsilon.$$

Since $\varepsilon > 0$ is arbitrary, we conclude that $x = \lim(x_n)$. Q.E.D.


3.1.11 Examples (a) If $a > 0$, then $\lim\left( \dfrac{1}{1 + na} \right) = 0$.

Since $a > 0$, then $0 < na < 1 + na$, and therefore $0 < 1/(1 + na) < 1/(na)$. Thus
we have

$$\left| \frac{1}{1 + na} - 0 \right| \le \left( \frac{1}{a} \right) \frac{1}{n} \quad \text{for all } n \in \mathbb{N}.$$

Since $\lim(1/n) = 0$, we may invoke Theorem 3.1.10 with $C = 1/a$ and $m = 1$ to infer that
$\lim(1/(1 + na)) = 0$.
(b) If $0 < b < 1$, then $\lim(b^n) = 0$.

This limit was obtained earlier in Example 3.1.6(e). We will give a second proof that
illustrates the use of Bernoulli's Inequality (see Example 2.1.13(c)).

Since $0 < b < 1$, we can write $b = 1/(1 + a)$, where $a := (1/b) - 1$ so that $a > 0$. By
Bernoulli's Inequality, we have $(1 + a)^n \ge 1 + na$. Hence

$$0 < b^n = \frac{1}{(1 + a)^n} \le \frac{1}{1 + na} < \frac{1}{na}.$$

Thus from Theorem 3.1.10 we conclude that $\lim(b^n) = 0$.

In particular, if $b = .8$, so that $a = .25$, and if we are given $\varepsilon = .01$, then the preceding
inequality gives us $K(\varepsilon) = 4/(.01) = 400$. Comparing with Example 3.1.6(e), where we
obtained $K = 21$, we see this method of estimation does not give us the “best” value of $K$.
However, for the purpose of establishing the limit, the size of $K$ is immaterial.
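The gap between the two values of $K$ can be seen by computing both. In the sketch below, `k_bernoulli` evaluates the estimate $K = 1/(a\varepsilon)$ obtained from the bound $b^n < 1/(na)$, while `k_smallest` searches for the least $n$ with $b^n < \varepsilon$; both names are our own, and exact rational arithmetic is used to avoid rounding at the boundary.

```python
import math
from fractions import Fraction


def k_bernoulli(b, eps):
    """K obtained from b**n < 1/(n*a), where a = 1/b - 1: take K >= 1/(a*eps)."""
    b, eps = Fraction(b), Fraction(eps)
    a = 1 / b - 1
    return math.ceil(1 / (a * eps))


def k_smallest(b, eps):
    """Least natural number n with b**n < eps (b**n is decreasing in n)."""
    b, eps = Fraction(b), Fraction(eps)
    n = 1
    while b ** n >= eps:
        n += 1
    return n


b, eps = Fraction(4, 5), Fraction(1, 100)
print(k_bernoulli(b, eps))  # 400
print(k_smallest(b, eps))   # 21
```

Either value is acceptable in the definition of the limit; the cruder estimate simply requires less knowledge about the sequence.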

(c) If $c > 0$, then $\lim(c^{1/n}) = 1$.

The case $c = 1$ is trivial, since then $(c^{1/n})$ is the constant sequence $(1, 1, \ldots)$, which
evidently converges to 1.

If $c > 1$, then $c^{1/n} = 1 + d_n$ for some $d_n > 0$. Hence by Bernoulli's Inequality 2.1.13(c),

$$c = (1 + d_n)^n \ge 1 + n d_n \quad \text{for } n \in \mathbb{N}.$$

Therefore we have $c - 1 \ge n d_n$, so that $d_n \le (c - 1)/n$. Consequently we have

$$|c^{1/n} - 1| = d_n \le (c - 1) \frac{1}{n} \quad \text{for } n \in \mathbb{N}.$$

We now invoke Theorem 3.1.10 to infer that $\lim(c^{1/n}) = 1$ when $c > 1$.

Now suppose that $0 < c < 1$; then $c^{1/n} = 1/(1 + h_n)$ for some $h_n > 0$. Hence
Bernoulli's Inequality implies that

$$c = \frac{1}{(1 + h_n)^n} \le \frac{1}{1 + n h_n} < \frac{1}{n h_n},$$

from which it follows that $0 < h_n < 1/(nc)$ for $n \in \mathbb{N}$. Therefore we have

$$0 < 1 - c^{1/n} = \frac{h_n}{1 + h_n} < h_n < \frac{1}{nc},$$

so that

$$|c^{1/n} - 1| < \left( \frac{1}{c} \right) \frac{1}{n} \quad \text{for } n \in \mathbb{N}.$$

We now apply Theorem 3.1.10 to infer that $\lim(c^{1/n}) = 1$ when $0 < c < 1$.
(d) $\lim(n^{1/n}) = 1$.

Since $n^{1/n} > 1$ for $n > 1$, we can write $n^{1/n} = 1 + k_n$ for some $k_n > 0$ when $n > 1$.
Hence $n = (1 + k_n)^n$ for $n > 1$. By the Binomial Theorem, if $n > 1$ we have

$$n = 1 + n k_n + \tfrac12 n(n - 1) k_n^2 + \cdots \ge 1 + \tfrac12 n(n - 1) k_n^2,$$

whence it follows that

$$n - 1 \ge \tfrac12 n(n - 1) k_n^2.$$

Hence $k_n^2 \le 2/n$ for $n > 1$. If $\varepsilon > 0$ is given, it follows from the Archimedean Property that
there exists a natural number $N_\varepsilon$ such that $2/N_\varepsilon < \varepsilon^2$. It follows that if $n \ge \sup\{2, N_\varepsilon\}$ then
$2/n < \varepsilon^2$, whence

$$0 < n^{1/n} - 1 = k_n \le (2/n)^{1/2} < \varepsilon.$$

Since $\varepsilon > 0$ is arbitrary, we deduce that $\lim(n^{1/n}) = 1$.
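The estimate $k_n \le (2/n)^{1/2}$ and the resulting choice of $N_\varepsilon$ can be spot-checked numerically. This sketch assumes nothing beyond the inequalities above; the names `root_gap` and `n_epsilon` are hypothetical, and the floating-point checks cover only finitely many $n$, so they illustrate the bound without proving it.

```python
import math
from fractions import Fraction


def root_gap(n):
    """k_n := n**(1/n) - 1, computed in floating point."""
    return n ** (1 / n) - 1


def n_epsilon(eps):
    """Smallest natural number N with 2/N < eps**2 (exact rational arithmetic)."""
    return math.floor(2 / Fraction(eps) ** 2) + 1


# The bound 0 < k_n <= sqrt(2/n) on a sample range of n > 1.
print(all(0 < root_gap(n) <= math.sqrt(2 / n) for n in range(2, 1000)))  # True

N = n_epsilon(Fraction(1, 10))
print(N)                   # 201
print(root_gap(N) < 0.1)   # True
```

As in Example (b), the value of $N_\varepsilon$ obtained from the estimate is far from the smallest one that works, which does not matter for establishing the limit.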


Exercises for Section 3.1 


1. The sequence $(x_n)$ is defined by the following formulas for the $n$th term. Write the first five terms
in each case:

(a) $x_n := 1 + (-1)^n$, (b) $x_n := (-1)^n/n$,

(c) $x_n := \dfrac{1}{n(n + 1)}$, (d) $x_n := \dfrac{1}{n^2 + 2}$.

2. The first few terms of a sequence $(x_n)$ are given below. Assuming that the “natural pattern”
indicated by these terms persists, give a formula for the $n$th term $x_n$.

(a) $5, 7, 9, 11, \ldots$, (b) $1/2, -1/4, 1/8, -1/16, \ldots$,

(c) $1/2, 2/3, 3/4, 4/5, \ldots$, (d) $1, 4, 9, 16, \ldots$.

3. List the first five terms of the following inductively defined sequences.

(a) $x_1 := 1$, $x_{n+1} := 3x_n + 1$,

(b) $y_1 := 2$, $y_{n+1} := \tfrac12 (y_n + 2/y_n)$,

(c) $z_1 := 1$, $z_2 := 2$, $z_{n+2} := (z_{n+1} + z_n)/(z_{n+1} - z_n)$,

(d) $s_1 := 3$, $s_2 := 5$, $s_{n+2} := s_n + s_{n+1}$.

4. For any $b \in \mathbb{R}$, prove that $\lim(b/n) = 0$.


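5. Use the definition of the limit of a sequence to establish the following limits.

(a) $\lim\left( \dfrac{n}{n^2 + 1} \right) = 0$, (b) $\lim\left( \dfrac{2n}{n + 1} \right) = 2$,

(c) $\lim\left( \dfrac{3n + 1}{2n + 5} \right) = \dfrac32$, (d) $\lim\left( \dfrac{n^2 - 1}{2n^2 + 3} \right) = \dfrac12$.

6. Show that

(a) $\lim\left( \dfrac{1}{\sqrt{n + 7}} \right) = 0$, (b) $\lim\left( \dfrac{2n}{n + 2} \right) = 2$,

(c) $\lim\left( \dfrac{\sqrt{n}}{n + 1} \right) = 0$, (d) $\lim\left( \dfrac{(-1)^n n}{n^2 + 1} \right) = 0$.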


7. Let $x_n := 1/\ln(n + 1)$ for $n \in \mathbb{N}$.

(a) Use the definition of limit to show that $\lim(x_n) = 0$.
(b) Find a specific value of $K(\varepsilon)$ as required in the definition of limit for each of (i) $\varepsilon = 1/2$, and
(ii) $\varepsilon = 1/10$.

8. Prove that $\lim(x_n) = 0$ if and only if $\lim(|x_n|) = 0$. Give an example to show that the
convergence of $(|x_n|)$ need not imply the convergence of $(x_n)$.

9. Show that if $x_n \ge 0$ for all $n \in \mathbb{N}$ and $\lim(x_n) = 0$, then $\lim(\sqrt{x_n}) = 0$.

10. Prove that if $\lim(x_n) = x$ and if $x > 0$, then there exists a natural number $M$ such that $x_n > 0$ for
all $n \ge M$.

11. Show that $\lim\left( \dfrac{1}{n} - \dfrac{1}{n + 1} \right) = 0$.

12. Show that $\lim(\sqrt{n^2 + 1} - n) = 0$.

13. Show that $\lim(1/3^n) = 0$.

14. Let $b \in \mathbb{R}$ satisfy $0 < b < 1$. Show that $\lim(n b^n) = 0$. [Hint: Use the Binomial Theorem as in
Example 3.1.11(d).]

15. Show that $\lim((2n)^{1/n}) = 1$.

16. Show that $\lim(n^2/n!) = 0$.

17. Show that $\lim(2^n/n!) = 0$. [Hint: If $n \ge 3$, then $0 < 2^n/n! \le 2\left(\tfrac23\right)^{n-2}$.]

18. If $\lim(x_n) = x > 0$, show that there exists a natural number $K$ such that if $n \ge K$, then
$\tfrac12 x < x_n < 2x$.


Section 3.2 Limit Theorems


In this section we will obtain some results that enable us to evaluate the limits of certain 
sequences of real numbers. These results will expand our collection of convergent 
sequences rather extensively. We begin by establishing an important property of conver- 
gent sequences that will be needed in this and later sections. 


3.2.1 Definition A sequence $X = (x_n)$ of real numbers is said to be bounded if there
exists a real number $M > 0$ such that $|x_n| \le M$ for all $n \in \mathbb{N}$.


Thus, the sequence $(x_n)$ is bounded if and only if the set $\{x_n : n \in \mathbb{N}\}$ of its values is a
bounded subset of $\mathbb{R}$.


3.2.2 Theorem A convergent sequence of real numbers is bounded. 


Proof. Suppose that lim(x,) = x and let e := 1. Then there exists a natural number 
K = K(1) such that |x, — x| < 1 for all n > K. If we apply the Triangle Inequality with 
n> K we obtain 

[Xn] = [xn — xX + x| < [Xn — x| + [x] < 1+ |x]. 


If we set 
M := sup{|xi|, |x2|, - aed) |xx-1l, 1+ |x|}, 


then it follows that |x,| < M for all n € N. Q.E.D. 


Remark We can also prove that a convergent sequence (x_n) is bounded using the language of
neighborhoods. If V_ε(x) is a given neighborhood of the limit x, then all but a finite number
of terms of the sequence belong to V_ε(x). Therefore, since V_ε(x) is clearly bounded and
finite sets are bounded, it follows that the sequence is bounded.
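
The bound M built in the proof of Theorem 3.2.2 can be computed explicitly for a concrete sequence. The following Python sketch is only an illustration: the sequence x_n := 5(-1)^n/n, the choice K = 6 for ε = 1, and the names x, K, and M are our own assumptions and do not come from the text.

```python
from fractions import Fraction

def x(n):
    """Illustrative convergent sequence x_n = 5(-1)^n / n, with limit 0."""
    return Fraction(5 * (-1) ** n, n)

limit = Fraction(0)
K = 6  # |x_n - 0| = 5/n < 1 exactly when n >= 6

# M := sup{|x_1|, ..., |x_{K-1}|, 1 + |x|}, as in the proof of Theorem 3.2.2
M = max([abs(x(n)) for n in range(1, K)] + [1 + abs(limit)])

# Spot-check the conclusion |x_n| <= M on an initial segment of the sequence
bounded = all(abs(x(n)) <= M for n in range(1, 2001))
print(M, bounded)
```

A finite check of this kind cannot replace the proof, but it shows how the finitely many early terms and the tail estimate 1 + |x| combine into a single bound.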


We will now examine how the limit process interacts with the operations of addition,
subtraction, multiplication, and division of sequences. If X = (x_n) and Y = (y_n) are
sequences of real numbers, then we define their sum to be the sequence X + Y := (x_n + y_n),
their difference to be the sequence X - Y := (x_n - y_n), and their product to be the
sequence X · Y := (x_n y_n). If c ∈ R, we define the multiple of X by c to be the sequence
cX := (c x_n). Finally, if Z = (z_n) is a sequence of real numbers with z_n ≠ 0 for all n ∈ N,
then we define the quotient of X and Z to be the sequence X/Z := (x_n/z_n).

For example, if X and Y are the sequences 

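X := (2, 4, 6, ..., 2n, ...), Y := (1/1, 1/2, 1/3, ..., 1/n, ...),


then we have


X + Y = (3/1, 9/2, 19/3, ..., (2n^2 + 1)/n, ...),
X - Y = (1/1, 7/2, 17/3, ..., (2n^2 - 1)/n, ...),
X · Y = (2, 2, 2, ..., 2, ...),
3X = (6, 12, 18, ..., 6n, ...),
X/Y = (2, 8, 18, ..., 2n^2, ...).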
We note that if Z is the sequence
Z := (0, 2, 0, ..., 1 + (-1)^n, ...),


then we can define X + Z, X - Z, and X · Z, but X/Z is not defined since some of the terms of
Z are zero.
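
The termwise operations above are easy to experiment with. The following Python sketch, which assumes exact rational arithmetic via the standard `fractions` module and uses helper names of our own choosing (x, y, first_terms), computes the first three terms of X + Y, X - Y, X · Y, and X/Y for X = (2n) and Y = (1/n).

```python
from fractions import Fraction

def x(n):
    """n-th term of X = (2n)."""
    return Fraction(2 * n)

def y(n):
    """n-th term of Y = (1/n)."""
    return Fraction(1, n)

def first_terms(term, count=3):
    """Return [term(1), ..., term(count)]."""
    return [term(n) for n in range(1, count + 1)]

sum_terms = first_terms(lambda n: x(n) + y(n))    # X + Y
diff_terms = first_terms(lambda n: x(n) - y(n))   # X - Y
prod_terms = first_terms(lambda n: x(n) * y(n))   # X · Y
quot_terms = first_terms(lambda n: x(n) / y(n))   # X / Y, defined since y(n) != 0

for name, terms in [("X+Y", sum_terms), ("X-Y", diff_terms),
                    ("X·Y", prod_terms), ("X/Y", quot_terms)]:
    print(name, [str(t) for t in terms])
```

The quotient is computed only because every term of Y is nonzero; replacing Y by the sequence Z above would raise a ZeroDivisionError at n = 1, which mirrors the remark that X/Z is not defined.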

We now show that sequences obtained by applying these operations to convergent 
sequences give rise to new sequences whose limits can be predicted. 


3.2.3 Theorem (a) Let X = (x_n) and Y = (y_n) be sequences of real numbers that
converge to x and y, respectively, and let c ∈ R. Then the sequences X + Y, X - Y, X · Y, and
cX converge to x + y, x - y, xy, and cx, respectively.


(b) If X = (x_n) converges to x and Z = (z_n) is a sequence of nonzero real numbers that
converges to z and if z ≠ 0, then the quotient sequence X/Z converges to x/z.


Proof. (a) To show that lim(x_n + y_n) = x + y, we need to estimate the magnitude of
|(x_n + y_n) - (x + y)|. To do this we use the Triangle Inequality 2.2.3 to obtain


|(x_n + y_n) - (x + y)| = |(x_n - x) + (y_n - y)|
                        ≤ |x_n - x| + |y_n - y|.

By hypothesis, if ε > 0 there exists a natural number K_1 such that if n ≥ K_1, then
|x_n - x| < ε/2; also there exists a natural number K_2 such that if n ≥ K_2, then
|y_n - y| < ε/2. Hence if K(ε) := sup{K_1, K_2}, it follows that if n ≥ K(ε) then


|(x_n + y_n) - (x + y)| ≤ |x_n - x| + |y_n - y|
                        < ε/2 + ε/2 = ε.


Since ε > 0 is arbitrary, we infer that X + Y = (x_n + y_n) converges to x + y.
Precisely the same argument can be used to show that X - Y = (x_n - y_n) converges to
x - y.
To show that X · Y = (x_n y_n) converges to xy, we make the estimate
|x_n y_n - xy| = |(x_n y_n - x_n y) + (x_n y - xy)|
               ≤ |x_n (y_n - y)| + |(x_n - x) y|
               = |x_n| |y_n - y| + |x_n - x| |y|.
According to Theorem 3.2.2 there exists a real number M_1 > 0 such that |x_n| ≤ M_1 for all
n ∈ N and we set M := sup{M_1, |y|}. Hence we have
|x_n y_n - xy| ≤ M|y_n - y| + M|x_n - x|.


From the convergence of X and Y we conclude that if ε > 0 is given, then there exist natural
numbers K_1 and K_2 such that if n ≥ K_1 then |x_n - x| < ε/2M, and if n ≥ K_2 then
|y_n - y| < ε/2M. Now let K(ε) = sup{K_1, K_2}; then, if n ≥ K(ε) we infer that
|x_n y_n - xy| ≤ M|y_n - y| + M|x_n - x|
               < M(ε/2M) + M(ε/2M) = ε.


Since ε > 0 is arbitrary, this proves that the sequence X · Y = (x_n y_n) converges to xy.

The fact that cX = (c x_n) converges to cx can be proved in the same way; it can also be
deduced by taking Y to be the constant sequence (c, c, c, ...). We leave the details to the
reader.
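
The ε/2 argument for the sum in part (a) can be traced numerically. The sketch below is an illustration under assumptions of our own: it takes x_n := 1/n (limit 0) and y_n := 2 + 1/n^2 (limit 2), neither of which appears in the text, computes K_1 and K_2 for a given ε, and checks that K(ε) = max{K_1, K_2} works for the sum.

```python
import math
from fractions import Fraction

def K_sum(eps):
    """Return K(eps) = max(K1, K2) for x_n = 1/n and y_n = 2 + 1/n^2."""
    q = math.floor(2 / eps)     # floor of 2/eps
    K1 = q + 1                  # smallest n with 1/n < eps/2
    K2 = math.isqrt(q) + 1      # smallest n with 1/n^2 < eps/2
    return max(K1, K2)

def error(n):
    """|(x_n + y_n) - (x + y)| with x = 0 and y = 2, in exact arithmetic."""
    x_n = Fraction(1, n)
    y_n = 2 + Fraction(1, n * n)
    return abs((x_n + y_n) - 2)

eps = Fraction(1, 100)
K = K_sum(eps)
ok = all(error(n) < eps for n in range(K, K + 500))
print(K, ok)
```

For ε = 1/100 the two thresholds are K_1 = 201 and K_2 = 15, so the slower of the two sequences dictates K(ε); this is exactly the role of the supremum in the proof.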
(b) We next show that if Z = (z_n) is a sequence of nonzero numbers that converges to a
nonzero limit z, then the sequence (1/z_n) of reciprocals converges to 1/z. First let α := (1/2)|z|
so that α > 0. Since lim(z_n) = z, there exists a natural number K_1 such that if
n ≥ K_1 then |z_n - z| < α. It follows from Corollary 2.2.4(a) of the Triangle Inequality
that -α ≤ -|z_n - z| ≤ |z_n| - |z| for n ≥ K_1, whence it follows that (1/2)|z| = |z| - α ≤ |z_n|
for n ≥ K_1. Therefore 1/|z_n| ≤ 2/|z| for n ≥ K_1 so we have the estimate


|1/z_n - 1/z| = |(z - z_n)/(z_n z)| = (1/|z_n z|) |z - z_n|
              ≤ (2/|z|^2) |z - z_n| for all n ≥ K_1.


Now, if ε > 0 is given, there exists a natural number K_2 such that if n ≥ K_2 then
|z_n - z| < (1/2) ε |z|^2. Therefore, it follows that if K(ε) = sup{K_1, K_2}, then


|1/z_n - 1/z| < ε for all n ≥ K(ε).


Since ε > 0 is arbitrary, it follows that


lim(1/z_n) = 1/z.


The proof of (b) is now completed by taking Y to be the sequence (1/z_n) and using the fact
that X · Y = (x_n/z_n) converges to x(1/z) = x/z. Q.E.D.


Some of the results of Theorem 3.2.3 can be extended, by Mathematical Induction, to a
finite number of convergent sequences. For example, if A = (a_n), B = (b_n), ..., Z = (z_n)
are convergent sequences of real numbers, then their sum A + B + ··· + Z = (a_n + b_n
+ ··· + z_n) is a convergent sequence and


(1) lim(a_n + b_n + ··· + z_n) = lim(a_n) + lim(b_n) + ··· + lim(z_n).
Also their product A · B ··· Z := (a_n b_n ··· z_n) is a convergent sequence and
(2) lim(a_n b_n ··· z_n) = (lim(a_n)) (lim(b_n)) ··· (lim(z_n)).
Hence, if k ∈ N and if A = (a_n) is a convergent sequence, then

(3) lim(a_n^k) = (lim(a_n))^k.


We leave the proofs of these assertions to the reader. 


3.2.4 Theorem If X = (x_n) is a convergent sequence of real numbers and if x_n ≥ 0 for all
n ∈ N, then x = lim(x_n) ≥ 0.


Proof. Suppose the conclusion is not true and that x < 0; then ε := -x is positive. Since X
converges to x, there is a natural number K such that
x - ε < x_n < x + ε for all n ≥ K.


In particular, we have x_K < x + ε = x + (-x) = 0. But this contradicts the hypothesis
that x_n ≥ 0 for all n ∈ N. Therefore, this contradiction implies that x ≥ 0. Q.E.D.


We now give a useful result that is formally stronger than Theorem 3.2.4. 


3.2.5 Theorem If X = (x_n) and Y = (y_n) are convergent sequences of real numbers and
if x_n ≤ y_n for all n ∈ N, then lim(x_n) ≤ lim(y_n).
Proof. Let z_n := y_n - x_n so that Z := (z_n) = Y - X and z_n ≥ 0 for all n ∈ N. It follows
from Theorems 3.2.3 and 3.2.4 that


0 ≤ lim Z = lim(y_n) - lim(x_n),


so that lim(x_n) ≤ lim(y_n). Q.E.D.


The next result asserts that if all the terms of a convergent sequence satisfy an inequality
of the form a ≤ x_n ≤ b, then the limit of the sequence satisfies the same inequality. Thus if the
sequence is convergent, one may “pass to the limit” in an inequality of this type.


3.2.6 Theorem If X = (x_n) is a convergent sequence and if a ≤ x_n ≤ b for all n ∈ N,
then a ≤ lim(x_n) ≤ b.


Proof. Let Y be the constant sequence (b, b, b, ...). Theorem 3.2.5 implies that
lim X ≤ lim Y = b. Similarly one shows that a ≤ lim X. Q.E.D.


The next result asserts that if a sequence Y is squeezed between two sequences that 
converge to the same limit, then it must also converge to this limit. 


3.2.7 Squeeze Theorem Suppose that X = (x_n), Y = (y_n), and Z = (z_n) are sequences
of real numbers such that
x_n ≤ y_n ≤ z_n for all n ∈ N,


and that lim(x_n) = lim(z_n). Then Y = (y_n) is convergent and


lim(x_n) = lim(y_n) = lim(z_n).


Proof. Let w := lim(x_n) = lim(z_n). If ε > 0 is given, then it follows from the convergence
of X and Z to w that there exists a natural number K such that if n ≥ K then
|x_n - w| < ε and |z_n - w| < ε.
Since the hypothesis implies that
x_n - w ≤ y_n - w ≤ z_n - w for all n ∈ N,
it follows (why?) that
-ε < y_n - w < ε


for all n ≥ K. Since ε > 0 is arbitrary, this implies that lim(y_n) = w. Q.E.D.


Remark Since any tail of a convergent sequence has the same limit, the hypotheses of
Theorems 3.2.4, 3.2.5, 3.2.6, and 3.2.7 can be weakened to apply to the tail of a sequence.
For example, in Theorem 3.2.4, if X = (x_n) is “ultimately positive” in the sense that there
exists m ∈ N such that x_n ≥ 0 for all n ≥ m, then the same conclusion that x ≥ 0 will hold.


Similar modifications are valid for the other theorems, as the reader should verify. 


3.2.8 Examples (a) The sequence (n) is divergent.

It follows from Theorem 3.2.2 that if the sequence X := (n) is convergent, then there
exists a real number M > 0 such that n = |n| < M for all n ∈ N. But this violates the
Archimedean Property 2.4.3.




(b) The sequence ((-1)^n) is divergent.


This sequence X = ((-1)^n) is bounded (take M := 1), so we cannot invoke
Theorem 3.2.2. However, assume that a := lim X exists. Let ε := 1 so that there exists
a natural number K_1 such that


|(-1)^n - a| < 1 for all n ≥ K_1.


If n is an odd natural number with n ≥ K_1, this gives |-1 - a| < 1, so that -2 < a < 0. On
the other hand, if n is an even natural number with n ≥ K_1, this inequality gives
|1 - a| < 1 so that 0 < a < 2. Since a cannot satisfy both of these inequalities, the
hypothesis that X is convergent leads to a contradiction. Therefore the sequence X is
divergent.


(c) lim((2n + 1)/n) = 2.


If we let X := (2) and Y := (1/n), then ((2n + 1)/n) = X + Y. Hence it follows
from Theorem 3.2.3(a) that lim(X + Y) = lim X + lim Y = 2 + 0 = 2.


(d) lim((2n + 1)/(n + 5)) = 2.

Since the sequences (2n + 1) and (n + 5) are not convergent (why?), it is not possible
to use Theorem 3.2.3(b) directly. However, if we write


(2n + 1)/(n + 5) = (2 + 1/n)/(1 + 5/n),


we can obtain the given sequence as one to which Theorem 3.2.3(b) applies when we
take X := (2 + 1/n) and Z := (1 + 5/n). (Check that all hypotheses are satisfied.) Since
lim X = 2 and lim Z = 1 ≠ 0, we deduce that lim((2n + 1)/(n + 5)) = 2/1 = 2.


(e) lim(2n/(n^2 + 1)) = 0.


Theorem 3.2.3(b) does not apply directly. (Why?) We note that

2n/(n^2 + 1) = 2/(n + 1/n),

but Theorem 3.2.3(b) does not apply here either, because (n + 1/n) is not a convergent
sequence. (Why not?) However, if we write


2n/(n^2 + 1) = (2/n)/(1 + 1/n^2),


then we can apply Theorem 3.2.3(b), since lim(2/n) = 0 and lim(1 + 1/n^2) = 1 ≠ 0.
Therefore lim(2n/(n^2 + 1)) = 0/1 = 0.
(f) lim((sin n)/n) = 0.

We cannot apply Theorem 3.2.3(b) directly, since the sequence (n) is not convergent
[neither is the sequence (sin n)]. It does not appear that a simple algebraic manipulation
will enable us to reduce the sequence into one to which Theorem 3.2.3 will apply. However,
if we note that -1 ≤ sin n ≤ 1, then it follows that


-1/n ≤ (sin n)/n ≤ 1/n for all n ∈ N.


Hence we can apply the Squeeze Theorem 3.2.7 to infer that lim(n^{-1} sin n) = 0. (We note
that Theorem 3.1.10 could also be applied to this sequence.)
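
Numerical evidence for Examples (d), (e), and (f) can be gathered with a few lines of Python. The sketch below is only a sanity check in floating-point arithmetic; the function names d, e, and f are our own labels for the three sequences, and the sample indices are arbitrary.

```python
import math

def d(n):
    """Example (d): (2n + 1)/(n + 5), expected limit 2."""
    return (2 * n + 1) / (n + 5)

def e(n):
    """Example (e): 2n/(n^2 + 1), expected limit 0."""
    return 2 * n / (n ** 2 + 1)

def f(n):
    """Example (f): sin(n)/n, expected limit 0."""
    return math.sin(n) / n

for n in (10, 1000, 100000):
    print(n, d(n), e(n), f(n))

# Squeeze bounds used in Example (f): -1/n <= sin(n)/n <= 1/n
squeezed = all(-1 / n <= f(n) <= 1 / n for n in range(1, 10001))
print(squeezed)
```

The printed values only suggest the limits; the limit theorems above are what actually establish them.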

(g) Let X = (x_n) be a sequence of real numbers that converges to x ∈ R. Let p be a
polynomial; for example, let


p(t) := a_k t^k + a_{k-1} t^{k-1} + ··· + a_1 t + a_0,
where k ∈ N and a_j ∈ R for j = 0, 1, ..., k. It follows from Theorem 3.2.3 that the
sequence (p(x_n)) converges to p(x). We leave the details to the reader as an exercise.
(h) Let X = (x_n) be a sequence of real numbers that converges to x ∈ R. Let r be a
rational function (that is, r(t) := p(t)/q(t), where p and q are polynomials). Suppose that
q(x_n) ≠ 0 for all n ∈ N and that q(x) ≠ 0. Then the sequence (r(x_n)) converges to
r(x) = p(x)/q(x). We leave the details to the reader as an exercise.


We conclude this section with several results that will be useful in the work that 
follows. 


3.2.9 Theorem Let the sequence X = (x_n) converge to x. Then the sequence (|x_n|) of
absolute values converges to |x|. That is, if x = lim(x_n), then |x| = lim(|x_n|).


Proof. It follows from the Triangle Inequality (see Corollary 2.2.4(a)) that
| |x_n| - |x| | ≤ |x_n - x| for all n ∈ N.


The convergence of (|x_n|) to |x| is then an immediate consequence of the convergence of
(x_n) to x. Q.E.D.


3.2.10 Theorem Let X = (x_n) be a sequence of real numbers that converges to x and
suppose that x_n ≥ 0. Then the sequence (√x_n) of positive square roots converges and


lim(√x_n) = √x.


Proof. It follows from Theorem 3.2.4 that x = lim(x_n) ≥ 0 so the assertion makes sense.
We now consider the two cases: (i) x = 0 and (ii) x > 0.

Case (i) If x = 0, let ε > 0 be given. Since x_n → 0 there exists a natural number K
such that if n ≥ K then


0 ≤ x_n = x_n - 0 < ε^2.


Therefore [see Example 2.1.13(a)], 0 ≤ √x_n < ε for n ≥ K. Since ε > 0 is arbitrary, this
implies that √x_n → 0.
Case (ii) If x > 0, then √x > 0 and we note that


√x_n - √x = (√x_n - √x)(√x_n + √x)/(√x_n + √x) = (x_n - x)/(√x_n + √x).


Since √x_n + √x ≥ √x > 0, it follows that

|√x_n - √x| ≤ (1/√x) |x_n - x|.


The convergence of √x_n → √x follows from the fact that x_n → x. Q.E.D.


For certain types of sequences, the following result provides a quick and easy “ratio 
test” for convergence. Related results can be found in the exercises. 
3.2.11 Theorem Let (x_n) be a sequence of positive real numbers such that L :=
lim(x_{n+1}/x_n) exists. If L < 1, then (x_n) converges and lim(x_n) = 0.


Proof. By 3.2.4 it follows that L ≥ 0. Let r be a number such that L < r < 1, and let
ε := r - L > 0. There exists a number K ∈ N such that if n ≥ K then


|x_{n+1}/x_n - L| < ε.


It follows from this (why?) that if n ≥ K, then


x_{n+1}/x_n < L + ε = L + (r - L) = r.


Therefore, if n ≥ K, we obtain


0 < x_{n+1} < x_n r < x_{n-1} r^2 < ··· < x_K r^{n-K+1}.


If we set C := x_K/r^K, we see that 0 < x_{n+1} < C r^{n+1} for all n ≥ K. Since 0 < r < 1, it
follows from 3.1.11(b) that lim(r^n) = 0 and therefore from Theorem 3.1.10 that
lim(x_n) = 0. Q.E.D.


As an illustration of the utility of the preceding theorem, consider the sequence (x_n)
given by x_n := n/2^n. We have


x_{n+1}/x_n = ((n + 1)/2^{n+1}) · (2^n/n) = (1/2)(1 + 1/n),


so that lim(x_{n+1}/x_n) = 1/2. Since 1/2 < 1, it follows from Theorem 3.2.11 that lim(n/2^n) = 0.
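
The ratio computation and the geometric bound from the proof of Theorem 3.2.11 can be checked exactly for x_n = n/2^n. In the Python sketch below, the choices r = 3/4 and K = 3 are our own (any r with 1/2 < r < 1 would do, with a correspondingly chosen K), and the helper names are illustrative.

```python
from fractions import Fraction

def x(n):
    """x_n = n / 2^n as an exact rational number."""
    return Fraction(n, 2 ** n)

def ratio(n):
    """x_{n+1} / x_n, which should equal (1/2)(1 + 1/n)."""
    return x(n + 1) / x(n)

# With L = 1/2 and r = 3/4 we have ratio(n) < r for every n >= K = 3.
r, K = Fraction(3, 4), 3
C = x(K) / r ** K   # the constant C := x_K / r^K from the proof

ratio_formula_ok = all(
    ratio(n) == Fraction(1, 2) * (1 + Fraction(1, n)) for n in range(1, 200)
)
geometric_bound_ok = all(x(n + 1) < C * r ** (n + 1) for n in range(K, 200))
print(ratio_formula_ok, geometric_bound_ok, float(x(100)))
```

The second check is the inequality 0 < x_{n+1} < C r^{n+1} for n ≥ K, which is what forces the terms to 0.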


Exercises for Section 3.2 


1. For x_n given by the following formulas, establish either the convergence or the divergence of the
sequence X = (x_n).


(a) x_n := n/(n + 1), (b) x_n := (-1)^n n/(n + 1),
(c) x_n := n^2/(n + 1), (d) x_n := (2n^2 + 3)/(n^2 + 1).
2. Give an example of two divergent sequences X and Y such that:
(a) their sum X + Y converges, (b) their product XY converges.


3. Show that if X and Y are sequences such that X and X + Y are convergent, then Y is convergent.


4. Show that if X and Y are sequences such that X converges to x ≠ 0 and XY converges, then Y
converges.


5. Show that the following sequences are not convergent.


(a) (2^n), (b) ((-1)^n n^2).
6. Find the limits of the following sequences:
(a) lim((2 + 1/n)^2), (b) lim((-1)^n/(n + 2)),
(c) lim((√n - 1)/(√n + 1)), (d) lim((n + 1)/(n√n)).
7. If (b_n) is a bounded sequence and lim(a_n) = 0, show that lim(a_n b_n) = 0. Explain why
Theorem 3.2.3 cannot be used.


8. Explain why the result in equation (3) before Theorem 3.2.4 cannot be used to evaluate the limit
of the sequence ((1 + 1/n)^n).


9. Let y_n := √(n + 1) - √n for n ∈ N. Show that (√n y_n) converges. Find the limit.
10. Determine the limits of the following sequences.

(a) (√(4n^2 + n) - 2n), (b) (√(n^2 + 5n) - n).
11. Determine the following limits.
(a) lim((3√n)^{1/(2n)}), (b) lim((n + 1)^{1/ln(n+1)}).
12. If 0 < a < b, determine lim((a^{n+1} + b^{n+1})/(a^n + b^n)).
13. If a > 0, b > 0, show that lim(√((n + a)(n + b)) - n) = (a + b)/2.


14. Use the Squeeze Theorem 3.2.7 to determine the limits of the following,

(a) (n^{1/n^2}), (b) ((n!)^{1/n^2}).

15. Show that if z_n := (a^n + b^n)^{1/n} where 0 < a < b, then lim(z_n) = b.

16. Apply Theorem 3.2.11 to the following sequences, where a, b satisfy 0 < a < 1, b > 1.

(a) (a^n), (b) (b^n/2^n),

(c) (n/b^n), (d) (2^{3n}/3^{2n}).

17. (a) Give an example of a convergent sequence (x_n) of positive numbers with lim(x_{n+1}/x_n) = 1.


(b) Give an example of a divergent sequence with this property. (Thus, this property cannot be
used as a test for convergence.)


18. Let X = (x_n) be a sequence of positive real numbers such that lim(x_{n+1}/x_n) = L > 1. Show that
X is not a bounded sequence and hence is not convergent.


19. Discuss the convergence of the following sequences, where a, b satisfy 0 < a < 1, b > 1.
(a) (n^2 a^n), (b) (b^n/n^2),
(c) (b^n/n!), (d) (n!/n^n).


20. Let (x_n) be a sequence of positive real numbers such that lim(x_n^{1/n}) = L < 1. Show that there
exists a number r with 0 < r < 1 such that 0 < x_n < r^n for all sufficiently large n ∈ N. Use this
to show that lim(x_n) = 0.


21. (a) Give an example of a convergent sequence (x_n) of positive numbers with lim(x_n^{1/n}) = 1.


(b) Give an example of a divergent sequence (x_n) of positive numbers with lim(x_n^{1/n}) = 1.
(Thus, this property cannot be used as a test for convergence.)


22. Suppose that (x_n) is a convergent sequence and (y_n) is such that for any ε > 0 there exists M such
that |x_n - y_n| < ε for all n ≥ M. Does it follow that (y_n) is convergent?


23. Show that if (x_n) and (y_n) are convergent sequences, then the sequences (u_n) and (v_n) defined by
u_n := max{x_n, y_n} and v_n := min{x_n, y_n} are also convergent. (See Exercise 2.2.18.)


24. Show that if (x_n), (y_n), (z_n) are convergent sequences, then the sequence (w_n) defined by
w_n := mid{x_n, y_n, z_n} is also convergent. (See Exercise 2.2.19.)


Section 3.3 Monotone Sequences 


Until now, we have obtained several methods of showing that a sequence X = (x_n) of real
numbers is convergent:
(i) We can use Definition 3.1.3 or Theorem 3.1.5 directly. This is often (but not
always) difficult to do.

(ii) We can dominate |x_n - x| by a multiple of the terms in a sequence (a_n) known to
converge to 0, and employ Theorem 3.1.10.

(iii) We can identify X as a sequence obtained from other sequences that are known to 
be convergent by taking tails, algebraic combinations, absolute values, or square roots, and 
employ Theorems 3.1.9, 3.2.3, 3.2.9, or 3.2.10. 

(iv) We can “squeeze” X between two sequences that converge to the same limit and 
use Theorem 3.2.7. 


(v) We can use the “ratio test” of Theorem 3.2.11. 


Except for (iii), all of these methods require that we already know (or at least suspect) the 
value of the limit, and we then verify that our suspicion is correct. 

There are many instances, however, in which there is no obvious candidate for the limit 
of a sequence, even though a preliminary analysis may suggest that convergence is likely. 
In this and the next two sections, we shall establish results that can be used to show a 
sequence is convergent even though the value of the limit is not known. The method we 
introduce in this section is more restricted in scope than the methods we give in the next 
two, but it is much easier to employ. It applies to sequences that are monotone in the 
following sense. 


3.3.1 Definition Let X = (x_n) be a sequence of real numbers. We say that X is increasing
if it satisfies the inequalities


x_1 ≤ x_2 ≤ ··· ≤ x_n ≤ x_{n+1} ≤ ···.


We say that X is decreasing if it satisfies the inequalities


x_1 ≥ x_2 ≥ ··· ≥ x_n ≥ x_{n+1} ≥ ···.


We say that X is monotone if it is either increasing or decreasing. 


The following sequences are increasing:
(1, 2, 3, 4, ..., n, ...), (1, 2, 2, 3, 3, 3, ...),
(a, a^2, a^3, ..., a^n, ...) if a > 1.
The following sequences are decreasing:
(1, 1/2, 1/3, ..., 1/n, ...), (1, 1/2, 1/2^2, ..., 1/2^{n-1}, ...),
(b, b^2, b^3, ..., b^n, ...) if 0 < b < 1.


The following sequences are not monotone:


(+1, -1, +1, ..., (-1)^{n+1}, ...), (-1, +2, -3, ..., (-1)^n n, ...).


The following sequences are not monotone, but they are “ultimately” monotone:


(7, 6, 2, 1, 2, 3, 4, ...), (-2, 0, 1, 1/2, 1/3, 1/4, ...).


3.3.2 Monotone Convergence Theorem A monotone sequence of real numbers is 
convergent if and only if it is bounded. Further: 
(a) If X = (x_n) is a bounded increasing sequence, then
lim(x_n) = sup{x_n : n ∈ N}.

(b) If Y = (y_n) is a bounded decreasing sequence, then
lim(y_n) = inf{y_n : n ∈ N}.


Proof. It was seen in Theorem 3.2.2 that a convergent sequence must be bounded. 

Conversely, let X be a bounded monotone sequence. Then X is either increasing or 
decreasing. 

(a) We first treat the case where X = (x_n) is a bounded, increasing sequence. Since X is
bounded, there exists a real number M such that x_n ≤ M for all n ∈ N. According to the
Completeness Property 2.3.6, the supremum x* = sup{x_n : n ∈ N} exists in R; we will
show that x* = lim(x_n).

If ε > 0 is given, then x* - ε is not an upper bound of the set {x_n : n ∈ N}, and hence
there exists x_K such that x* - ε < x_K. The fact that X is an increasing sequence implies that
x_K ≤ x_n whenever n ≥ K, so that


x* - ε < x_K ≤ x_n ≤ x* < x* + ε for all n ≥ K.
Therefore we have
|x_n - x*| < ε for all n ≥ K.
Since ε > 0 is arbitrary, we conclude that (x_n) converges to x*.

(b) If Y = (y_n) is a bounded decreasing sequence, then it is clear that X := -Y =
(-y_n) is a bounded increasing sequence. It was shown in part (a) that lim X =
sup{-y_n : n ∈ N}. Now lim X = -lim Y and also, by Exercise 2.4.4(b), we have

sup{-y_n : n ∈ N} = -inf{y_n : n ∈ N}.
Therefore lim Y = -lim X = inf{y_n : n ∈ N}. Q.E.D.


The Monotone Convergence Theorem establishes the existence of the limit of a 
bounded monotone sequence. It also gives us a way of calculating the limit of the sequence 
provided we can evaluate the supremum in case (a), or the infimum in case (b). Sometimes 
it is difficult to evaluate this supremum (or infimum), but once we know that it exists, it is 
often possible to evaluate the limit by other methods. 


3.3.3 Examples (a) lim(1/√n) = 0.

It is possible to handle this sequence by using Theorem 3.2.10; however, we shall use
the Monotone Convergence Theorem. Clearly 0 is a lower bound for the set {1/√n :
n ∈ N}, and it is not difficult to show that 0 is the infimum of the set {1/√n : n ∈ N};
hence 0 = lim(1/√n).

On the other hand, once we know that X := (1/√n) is bounded and decreasing, we
know that it converges to some real number x. Since X = (1/√n) converges to x, it
follows from Theorem 3.2.3 that X · X = (1/n) converges to x^2. Therefore x^2 = 0,
whence x = 0.

(b) Let h_n := 1 + 1/2 + 1/3 + ··· + 1/n for n ∈ N.

Since h_{n+1} = h_n + 1/(n + 1) > h_n, we see that (h_n) is an increasing sequence. By the
Monotone Convergence Theorem 3.3.2, the question of whether the sequence is convergent
or not is reduced to the question of whether the sequence is bounded or not. Attempts to use
direct numerical calculations to arrive at a conjecture concerning the possible boundedness
of the sequence (h_n) lead to inconclusive frustration. A computer run will reveal the
approximate values h_n ≈ 11.4 for n = 50,000, and h_n ≈ 12.1 for n = 100,000. Such
numerical facts may lead the casual observer to conclude that the sequence is bounded.
However, the sequence is in fact divergent, which is established by noting that


h_{2^n} = 1 + 1/2 + (1/3 + 1/4) + ··· + (1/(2^{n-1} + 1) + ··· + 1/2^n)
        > 1 + 1/2 + (1/4 + 1/4) + ··· + (1/2^n + ··· + 1/2^n)
        = 1 + 1/2 + 1/2 + ··· + 1/2
        = 1 + n/2.


Since (h_n) is unbounded, Theorem 3.2.2 implies that it is divergent. (This proves that
the infinite series known as the harmonic series diverges. See Example 3.7.6(b) in
Section 3.7.)
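
Both the slow growth of h_n and the lower bound 1 + n/2 for h_{2^n} can be observed directly. The Python sketch below is an illustration only; the function names h and h_float are our own, exact rational arithmetic is used for the inequality, and ordinary floating-point summation is used for the two large values quoted above.

```python
from fractions import Fraction

def h(n):
    """Exact partial sum h_n = 1 + 1/2 + ... + 1/n."""
    return sum(Fraction(1, k) for k in range(1, n + 1))

# Check h_{2^n} >= 1 + n/2 for n = 0, 1, ..., 10 in exact arithmetic
lower_bound_holds = all(h(2 ** n) >= 1 + Fraction(n, 2) for n in range(0, 11))

def h_float(n):
    """Floating-point approximation of h_n."""
    return sum(1.0 / k for k in range(1, n + 1))

h_50k = h_float(50_000)
h_100k = h_float(100_000)
print(lower_bound_holds, round(h_50k, 1), round(h_100k, 1))
```

Doubling n from 50,000 to 100,000 adds only about 0.7 to the sum, which is why numerical experiments alone give a misleading impression of boundedness.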

The terms h_n increase extremely slowly. For example, it can be shown that to achieve
h_n > 50 would entail approximately 5.2 × 10^21 additions, and a normal computer performing
400 million additions a second would require more than 400,000 years to perform
the calculation (there are 31,536,000 seconds in a year). A supercomputer that can perform
more than a trillion additions a second would take more than 164 years to reach that modest
goal. And the IBM Roadrunner supercomputer at a speed of a quadrillion operations per
second would still take about 60 days.


Sequences that are defined inductively must be treated differently. If such a sequence is 
known to converge, then the value of the limit can sometimes be determined by using the 
inductive relation. 

For example, suppose that convergence has been established for the sequence (x_n)
defined by


x_1 := 2, x_{n+1} := 2 + 1/x_n, n ∈ N.

If we let x = lim(x_n), then we also have x = lim(x_{n+1}) since the 1-tail (x_{n+1}) converges to
the same limit. Further, we see that x_n ≥ 2, so that x ≠ 0 and x_n ≠ 0 for all n ∈ N.
Therefore, we may apply the limit theorems for sequences to obtain


x = lim(x_{n+1}) = 2 + 1/lim(x_n) = 2 + 1/x.


Thus, the limit x is a solution of the quadratic equation x^2 - 2x - 1 = 0, and since x must
be positive, we find that the limit of the sequence is x = 1 + √2.

Of course, the issue of convergence must not be ignored or casually assumed. For
example, if we assumed the sequence (y_n) defined by y_1 := 1, y_{n+1} := 2y_n + 1 is convergent
with limit y, then we would obtain y = 2y + 1, so that y = -1. Of course, this is
absurd.
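
The contrast between these two recursions is easy to see numerically. The following Python sketch iterates both; the helper name iterate and the numbers of terms computed are our own choices, and the floating-point output is evidence for, not a proof of, convergence.

```python
import math

def iterate(step, start, count):
    """Return the first `count` terms of t_1 = start, t_{n+1} = step(t_n)."""
    terms = [start]
    for _ in range(count - 1):
        terms.append(step(terms[-1]))
    return terms

xs = iterate(lambda t: 2 + 1 / t, 2.0, 40)   # x_1 = 2, x_{n+1} = 2 + 1/x_n
ys = iterate(lambda t: 2 * t + 1, 1, 10)     # y_1 = 1, y_{n+1} = 2 y_n + 1

print(xs[-1], 1 + math.sqrt(2))   # the iterates settle near 1 + sqrt(2)
print(ys)                          # the terms grow without bound
```

The terms y_n are all positive and unbounded, so the "limit" y = -1 obtained by formally passing to the limit in the recursion has no meaning.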

In the following examples, we employ this method of evaluating limits, but only after 
carefully establishing convergence using the Monotone Convergence Theorem. Additional 
examples of this type will be given in Section 3.5. 


3.3.4 Examples (a) Let Y = (y_n) be defined inductively by y_1 := 1, y_{n+1} := (1/4)(2y_n + 3)
for n ≥ 1. We shall show that lim Y = 3/2.
Direct calculation shows that y_2 = 5/4. Hence we have y_1 < y_2 < 2. We show, by
Induction, that y_n < 2 for all n ∈ N. Indeed, this is true for n = 1, 2. If y_k < 2 holds for
some k ∈ N, then


y_{k+1} = (1/4)(2y_k + 3) < (1/4)(4 + 3) = 7/4 < 2,


so that y_{k+1} < 2. Therefore y_n < 2 for all n ∈ N.

We now show, by Induction, that y_n < y_{n+1} for all n ∈ N. The truth of this assertion
has been verified for n = 1. Now suppose that y_k < y_{k+1} for some k; then 2y_k + 3
< 2y_{k+1} + 3, whence it follows that


y_{k+1} = (1/4)(2y_k + 3) < (1/4)(2y_{k+1} + 3) = y_{k+2}.


Thus y_k < y_{k+1} implies that y_{k+1} < y_{k+2}. Therefore y_n < y_{n+1} for all n ∈ N.

We have shown that the sequence Y = (y_n) is increasing and bounded above by 2. It
follows from the Monotone Convergence Theorem that Y converges to a limit that is at
most 2. In this case it is not so easy to evaluate lim(y_n) by calculating sup{y_n : n ∈ N}.
However, there is another way to evaluate its limit. Since y_{n+1} = (1/4)(2y_n + 3) for all n ∈ N,
the nth term in the 1-tail Y_1 of Y has a simple algebraic relation to the nth term of Y. Since,
by Theorem 3.1.9, we have y := lim Y_1 = lim Y, it therefore follows from Theorem 3.2.3
(why?) that


y = (1/4)(2y + 3),


from which it follows that y = 3/2.
(b) Let Z = (z_n) be the sequence of real numbers defined by z_1 := 1, z_{n+1} := √(2z_n) for
n ∈ N. We will show that lim(z_n) = 2.

Note that z_1 = 1 and z_2 = √2; hence 1 ≤ z_1 < z_2 < 2. We claim that the sequence Z
is increasing and bounded above by 2. To show this we will show, by Induction, that
1 ≤ z_n < z_{n+1} < 2 for all n ∈ N. This fact has been verified for n = 1. Suppose that it is
true for n = k; then 2 ≤ 2z_k < 2z_{k+1} < 4, whence it follows (why?) that


1 < √2 ≤ z_{k+1} = √(2z_k) < z_{k+2} = √(2z_{k+1}) < √4 = 2.


[In this last step we have used Example 2.1.13(a).] Hence the validity of the inequality
1 ≤ z_k < z_{k+1} < 2 implies the validity of 1 ≤ z_{k+1} < z_{k+2} < 2. Therefore 1 ≤ z_n <
z_{n+1} < 2 for all n ∈ N.

Since Z = (z_n) is a bounded increasing sequence, it follows from the Monotone
Convergence Theorem that it converges to a number z := sup{z_n}. It may be shown
directly that sup{z_n} = 2, so that z = 2. Alternatively we may use the method employed in
part (a). The relation z_{n+1} = √(2z_n) gives a relation between the nth term of the 1-tail Z_1 of Z
and the nth term of Z. By Theorem 3.1.9, we have lim Z_1 = z = lim Z. Moreover, by
Theorems 3.2.3 and 3.2.10, it follows that the limit z must satisfy the relation


z = √(2z).


Hence z must satisfy the equation z^2 = 2z, which has the roots z = 0, 2. Since the terms of
Z = (z_n) all satisfy 1 ≤ z_n ≤ 2, it follows from Theorem 3.2.6 that we must have
1 ≤ z ≤ 2. Therefore z = 2.
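
The monotonicity and boundedness established in Examples 3.3.4(a) and (b) can be observed on the first few terms. The Python sketch below is illustrative only: it uses exact rationals for (y_n) and floating-point arithmetic for (z_n), and the numbers of terms (30 and 20) are arbitrary choices of ours.

```python
import math
from fractions import Fraction

# Example 3.3.4(a): y_1 = 1, y_{n+1} = (2 y_n + 3)/4, in exact arithmetic
ys = [Fraction(1)]
for _ in range(29):
    ys.append((2 * ys[-1] + 3) / 4)

# Example 3.3.4(b): z_1 = 1, z_{n+1} = sqrt(2 z_n), in floating point
zs = [1.0]
for _ in range(19):
    zs.append(math.sqrt(2 * zs[-1]))

# Increasing and bounded above by 2, as shown by Induction in the text
y_increasing = all(a < b < 2 for a, b in zip(ys, ys[1:]))
z_increasing = all(1 <= a < b < 2 for a, b in zip(zs, zs[1:]))

print(float(ys[-1]), zs[-1], y_increasing, z_increasing)
```

The computed terms approach 3/2 and 2 respectively, in agreement with the limits obtained by passing to the limit in the recursions.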
The Calculation of Square Roots


We now give an application of the Monotone Convergence Theorem to the calculation of 
square roots of positive numbers. 


3.3.5 Example Let a > 0; we will construct a sequence (s_n) of real numbers that
converges to √a.

Let s_1 > 0 be arbitrary and define s_{n+1} := (1/2)(s_n + a/s_n) for n ∈ N. We now show that
the sequence (s_n) converges to √a. (This process for calculating square roots was known in
Mesopotamia before 1500 B.C.)

We first show that s_n^2 ≥ a for n ≥ 2. Since s_n satisfies the quadratic equation
s_n^2 - 2s_{n+1}s_n + a = 0, this equation has a real root. Hence the discriminant 4s_{n+1}^2 - 4a
must be nonnegative; that is, s_{n+1}^2 ≥ a for n ≥ 1.

To see that (s_n) is ultimately decreasing, we note that for n ≥ 2 we have


s_n - s_{n+1} = s_n - (1/2)(s_n + a/s_n) = (1/2)(s_n^2 - a)/s_n ≥ 0.


Hence, s_{n+1} ≤ s_n for all n ≥ 2. The Monotone Convergence Theorem implies that s :=
lim(s_n) exists. Moreover, from Theorem 3.2.3, the limit s must satisfy the relation


s = (1/2)(s + a/s),


whence it follows (why?) that s = a/s or s^2 = a. Thus s = √a.

For the purposes of calculation, it is often important to have an estimate of how rapidly
the sequence (s_n) converges to √a. As above, we have √a ≤ s_n for all n ≥ 2, whence it
follows that a/s_n ≤ √a ≤ s_n. Thus we have


0 ≤ s_n - √a ≤ s_n - a/s_n = (s_n^2 - a)/s_n for n ≥ 2.


Using this inequality we can calculate √a to any desired degree of accuracy.
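
The iteration and its error estimate translate directly into a short program. The Python sketch below assumes a = 2 and s_1 = 1 purely for illustration, works in exact rational arithmetic, and uses function names (sqrt_iterates, error_bound) of our own choosing.

```python
from fractions import Fraction

def sqrt_iterates(a, s1, count):
    """First `count` terms of s_{n+1} = (s_n + a/s_n)/2, as exact rationals."""
    s = [Fraction(s1)]
    for _ in range(count - 1):
        s.append((s[-1] + Fraction(a) / s[-1]) / 2)
    return s

def error_bound(a, s_n):
    """Upper bound (s_n^2 - a)/s_n for s_n - sqrt(a), valid for n >= 2."""
    return (s_n * s_n - a) / s_n

s = sqrt_iterates(2, 1, 6)
for n, s_n in enumerate(s[1:], start=2):
    # Print the index, the iterate, and the guaranteed error bound
    print(n, float(s_n), float(error_bound(2, s_n)))
```

With these choices the fourth term is 577/408 and the bound already guarantees an error below 5 × 10^{-6}, which illustrates how quickly the estimate (s_n^2 - a)/s_n shrinks.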


Euler’s Number 


We conclude this section by introducing a sequence that converges to one of the
most important “transcendental” numbers in mathematics, second in importance only
to π.


3.3.6 Example Let e_n := (1 + 1/n)^n for n ∈ N. We will now show that the sequence E =
(e_n) is bounded and increasing; hence it is convergent. The limit of this sequence is the
famous Euler number e, whose approximate value is 2.718 281 828 459 045 ..., which is
taken as the base of the “natural” logarithm.

If we apply the Binomial Theorem, we have


e_n = (1 + 1/n)^n = 1 + (n/1)(1/n) + (n(n-1)/2!)(1/n^2) + (n(n-1)(n-2)/3!)(1/n^3)
      + ··· + (n(n-1)···2·1/n!)(1/n^n).


If we divide the powers of n into the terms in the numerators of the binomial coefficients,
we get

e_n = 1 + 1 + (1/2!)(1 - 1/n) + (1/3!)(1 - 1/n)(1 - 2/n)
      + ··· + (1/n!)(1 - 1/n)(1 - 2/n)···(1 - (n-1)/n).


Similarly we have


e_{n+1} = 1 + 1 + (1/2!)(1 - 1/(n+1)) + (1/3!)(1 - 1/(n+1))(1 - 2/(n+1))
          + ··· + (1/n!)(1 - 1/(n+1))(1 - 2/(n+1))···(1 - (n-1)/(n+1))
          + (1/(n+1)!)(1 - 1/(n+1))(1 - 2/(n+1))···(1 - n/(n+1)).


Note that the expression for e_n contains n + 1 terms, while that for e_{n+1} contains n + 2
terms. Moreover, each term appearing in e_n is less than or equal to the corresponding term
in e_{n+1}, and e_{n+1} has one more positive term. Therefore we have 2 ≤ e_1 < e_2 < ··· <
e_n < e_{n+1} < ···, so that the terms of E are increasing.

To show that the terms of E are bounded above, we note that if p = 1, 2, ..., n, then
(1 - p/n) < 1. Moreover 2^{p-1} ≤ p! [see 1.2.4(e)] so that 1/p! ≤ 1/2^{p-1}. Therefore, if n >
1, then we have


2 < e_n < 1 + 1 + 1/2 + 1/2^2 + ··· + 1/2^{n-1}.

Since it can be verified that [see 1.2.4(f)]

1/2 + 1/2^2 + ··· + 1/2^{n-1} = 1 - 1/2^{n-1} < 1,


we deduce that 2 ≤ e_n < 3 for all n ∈ N. The Monotone Convergence Theorem implies
that the sequence E converges to a real number that is between 2 and 3. We define the
number e to be the limit of this sequence.

By refining our estimates we can find closer rational approximations to e, but we
cannot evaluate it exactly, since e is an irrational number. However, it is possible to
calculate e to as many decimal places as desired. The reader should use a calculator (or a
computer) to evaluate e_n for “large” values of n.
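
One way to carry out this suggestion is sketched below in Python. The helper name e_exact, the range of n checked exactly (1 to 50), and the sample values of n printed are our own choices; the comparison with `math.e` is included only as a floating-point sanity check.

```python
import math
from fractions import Fraction

def e_exact(n):
    """e_n = (1 + 1/n)^n as an exact rational number."""
    return Fraction(n + 1, n) ** n

# Exact check that the first 50 terms are increasing and lie in [2, 3)
exact_terms = [e_exact(n) for n in range(1, 51)]
increasing = all(a < b for a, b in zip(exact_terms, exact_terms[1:]))
bounded = all(2 <= t < 3 for t in exact_terms)

for n in (1, 10, 100, 10_000, 1_000_000):
    print(n, (1 + 1 / n) ** n)

e_big = (1 + 1 / 1_000_000) ** 1_000_000
print(increasing, bounded, math.e - e_big)
```

The output shows that the convergence is slow: even for n = 1,000,000 the term e_n agrees with e only to about five decimal places.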


Leonhard Euler 

Leonhard Euler (1707-1783) was born near Basel, Switzerland. His 
clergyman father hoped his son would follow him into the ministry, 
but when Euler entered the University of Basel at age 14, he studied 
medicine, physics, astronomy, and mathematics as well as theology. His 
mathematical talent was noticed by Johann Bernoulli, who became his 
mentor. In 1727, Euler traveled to Russia to join Bernoulli’s son, Daniel, 
at the new St. Petersburg Academy. There he met and married Katharina 
Gsell, the daughter of a Swiss artist. During their 40-year marriage, they 
had 13 children, but only five survived childhood. 
In 1741, Euler accepted an offer from Frederick the Great to join the Berlin Academy, where
he stayed for 25 years. During this period he wrote landmark books on a relatively new subject 
called calculus and a steady stream of papers on mathematics and science. In response to a request 
for instruction in science from the Princess of Anhalt-Dessau, he wrote her nearly 200 letters on 
science that later became famous in a book titled Letters to a German Princess. When Euler lost 
vision in one eye, Frederick thereafter referred to him as his mathematical “cyclops.” 

In 1766, he happily returned to Russia at the invitation of Catherine the Great. His eyesight 
continued to deteriorate and in 1771 he became totally blind following an eye operation. 
Incredibly, his blindness made little impact on his mathematical output, for he wrote several
books and over 400 papers while blind. He remained active until the day of his death. 

Euler’s productivity was remarkable. He wrote textbooks on physics, algebra, calculus, real 
and complex analysis, and differential geometry. He also wrote hundreds of papers, many winning 
prizes. A current edition of his collected works consists of 74 volumes. 


Exercises for Section 3.3 


1. Let $x_1 := 8$ and $x_{n+1} := \frac{1}{2}x_n + 2$ for $n \in \mathbb{N}$. Show that $(x_n)$ is bounded and monotone. Find the
limit.

2. Let $x_1 > 1$ and $x_{n+1} := 2 - 1/x_n$ for $n \in \mathbb{N}$. Show that $(x_n)$ is bounded and monotone. Find the
limit.

3. Let $x_1 \ge 2$ and $x_{n+1} := 1 + \sqrt{x_n - 1}$ for $n \in \mathbb{N}$. Show that $(x_n)$ is decreasing and bounded
below by 2. Find the limit.

4. Let $x_1 := 1$ and $x_{n+1} := \sqrt{2 + x_n}$ for $n \in \mathbb{N}$. Show that $(x_n)$ converges and find the limit.

5. Let $y_1 := \sqrt{p}$, where $p > 0$, and $y_{n+1} := \sqrt{p + y_n}$ for $n \in \mathbb{N}$. Show that $(y_n)$ converges and find
the limit. [Hint: One upper bound is $1 + 2\sqrt{p}$.]

6. Let $a > 0$ and let $z_1 > 0$. Define $z_{n+1} := \sqrt{a + z_n}$ for $n \in \mathbb{N}$. Show that $(z_n)$ converges and find
the limit.

7. Let $x_1 := a > 0$ and $x_{n+1} := x_n + 1/x_n$ for $n \in \mathbb{N}$. Determine whether $(x_n)$ converges or diverges.

8. Let $(a_n)$ be an increasing sequence, $(b_n)$ be a decreasing sequence, and assume that $a_n \le b_n$ for
all $n \in \mathbb{N}$. Show that $\lim(a_n) \le \lim(b_n)$, and thereby deduce the Nested Intervals Property 2.5.2
from the Monotone Convergence Theorem 3.3.2.

9. Let $A$ be an infinite subset of $\mathbb{R}$ that is bounded above and let $u := \sup A$. Show there exists an
increasing sequence $(x_n)$ with $x_n \in A$ for all $n \in \mathbb{N}$ such that $u = \lim(x_n)$.

10. Establish the convergence or the divergence of the sequence $(y_n)$, where

$$y_n := \frac{1}{n+1} + \frac{1}{n+2} + \cdots + \frac{1}{2n} \quad \text{for } n \in \mathbb{N}.$$

11. Let $x_n := 1/1^2 + 1/2^2 + \cdots + 1/n^2$ for each $n \in \mathbb{N}$. Prove that $(x_n)$ is increasing and bounded,
and hence converges. [Hint: Note that if $k \ge 2$, then $1/k^2 \le 1/k(k - 1) = 1/(k - 1) - 1/k$.]

12. Establish the convergence and find the limits of the following sequences.

(a) $\left((1 + 1/n)^{n+1}\right)$, (b) $\left((1 + 1/n)^{2n}\right)$,
(c) $\left(\left(1 + \frac{1}{n+1}\right)^{n}\right)$, (d) $\left((1 - 1/n)^{n}\right)$.

13. Use the method in Example 3.3.5 to calculate $\sqrt{2}$, correct to within 4 decimals.

14. Use the method in Example 3.3.5 to calculate $\sqrt{5}$, correct to within 5 decimals.

15. Calculate the number $e_n$ in Example 3.3.6 for $n = 2, 4, 8, 16$.

16. Use a calculator to compute $e_n$ for $n = 50$, $n = 100$, and $n = 1000$.
Section 3.4 Subsequences and the Bolzano-Weierstrass Theorem 


In this section we will introduce the notion of a subsequence of a sequence of real numbers. 
Informally, a subsequence of a sequence is a selection of terms from the given sequence 
such that the selected terms form a new sequence. Usually the selection is made for a 
definite purpose. For example, subsequences are often useful in establishing the conver- 
gence or the divergence of the sequence. We will also prove the important existence 
theorem known as the Bolzano-Weierstrass Theorem, which will be used to establish a 
number of significant results. 


3.4.1 Definition Let $X = (x_n)$ be a sequence of real numbers and let $n_1 < n_2 < \cdots <
n_k < \cdots$ be a strictly increasing sequence of natural numbers. Then the sequence $X' =
(x_{n_k})$ given by

$$(x_{n_1}, x_{n_2}, \ldots, x_{n_k}, \ldots)$$

is called a subsequence of $X$.


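For example, if $X := \left(\frac{1}{1}, \frac{1}{2}, \frac{1}{3}, \ldots\right)$, then the selection of even indexed terms produces
the subsequence

$$X' = \left(\frac{1}{2}, \frac{1}{4}, \frac{1}{6}, \ldots, \frac{1}{2k}, \ldots\right),$$

where $n_1 = 2$, $n_2 = 4$, $\ldots$, $n_k = 2k$, $\ldots$. Other subsequences of $X = (1/n)$ are the
following:

$$\left(\frac{1}{1}, \frac{1}{3}, \frac{1}{5}, \ldots, \frac{1}{2k-1}, \ldots\right), \qquad \left(\frac{1}{2!}, \frac{1}{4!}, \frac{1}{6!}, \ldots, \frac{1}{(2k)!}, \ldots\right).$$

The following sequences are not subsequences of $X = (1/n)$:

$$\left(\frac{1}{2}, \frac{1}{1}, \frac{1}{4}, \frac{1}{3}, \frac{1}{6}, \frac{1}{5}, \ldots\right), \qquad \left(\frac{1}{1}, 0, \frac{1}{3}, 0, \frac{1}{5}, 0, \ldots\right).$$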
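
The definition can be illustrated with a small Python sketch. The helper names `subsequence` and `reciprocal` are hypothetical, and only finitely many indices can be listed, so this checks the index condition on a finite prefix rather than on a genuine infinite subsequence.

```python
from fractions import Fraction
from typing import Callable, List


def subsequence(x: Callable[[int], Fraction], indices: List[int]) -> List[Fraction]:
    """Return [x(n_1), x(n_2), ...] after checking 1 <= n_1 < n_2 < ... ."""
    if not indices or indices[0] < 1:
        raise ValueError("indices must be natural numbers")
    if any(a >= b for a, b in zip(indices, indices[1:])):
        raise ValueError("indices must be strictly increasing")
    return [x(n) for n in indices]


def reciprocal(n: int) -> Fraction:
    """The n-th term of X = (1/n)."""
    return Fraction(1, n)


# Even-indexed terms: n_k = 2k for k = 1, ..., 5.
even_terms = subsequence(reciprocal, [2 * k for k in range(1, 6)])
print(even_terms)
```

A rearranged index list such as `[2, 1, 4, 3]` is rejected, which mirrors the first sequence above that is not a subsequence of X.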


A tail of a sequence (see 3.1.8) is a special type of subsequence. In fact, the m-tail 
corresponds to the sequence of indices 


$$n_1 = m + 1, \quad n_2 = m + 2, \quad \ldots, \quad n_k = m + k, \quad \ldots.$$


But, clearly, not every subsequence of a given sequence need be a tail of the sequence. 
Subsequences of convergent sequences also converge to the same limit, as we now show. 


3.4.2 Theorem If a sequence $X = (x_n)$ of real numbers converges to a real number $x$,
then any subsequence $X' = (x_{n_k})$ of $X$ also converges to $x$.


Proof. Let $\varepsilon > 0$ be given and let $K(\varepsilon)$ be such that if $n \ge K(\varepsilon)$, then $|x_n - x| < \varepsilon$. Since
$n_1 < n_2 < \cdots < n_k < \cdots$ is an increasing sequence of natural numbers, it is easily proved
(by Induction) that $n_k \ge k$. Hence, if $k \ge K(\varepsilon)$, we also have $n_k \ge k \ge K(\varepsilon)$ so that
$|x_{n_k} - x| < \varepsilon$. Therefore the subsequence $(x_{n_k})$ also converges to $x$. Q.E.D.


3.4.3 Examples (a) $\lim(b^n) = 0$ if $0 < b < 1$.
We have already seen, in Example 3.1.11(b), that if $0 < b < 1$ and if $x_n := b^n$, then it
follows from Bernoulli’s Inequality that $\lim(x_n) = 0$. Alternatively, we see that since
$0 < b < 1$, then $x_{n+1} = b^{n+1} < b^n = x_n$ so that the sequence $(x_n)$ is decreasing. It is also
clear that $0 \le x_n \le 1$, so it follows from the Monotone Convergence Theorem 3.3.2 that
the sequence is convergent. Let $x := \lim x_n$. Since $(x_{2n})$ is a subsequence of $(x_n)$ it follows
from Theorem 3.4.2 that $x = \lim(x_{2n})$. Moreover, it follows from the relation $x_{2n} = b^{2n} =
(b^n)^2 = x_n^2$ and Theorem 3.2.3 that

$$x = \lim(x_{2n}) = (\lim(x_n))^2 = x^2.$$

Therefore we must have either $x = 0$ or $x = 1$. Since the sequence $(x_n)$ is decreasing and
bounded above by $b < 1$, we deduce that $x = 0$.
(b) $\lim(c^{1/n}) = 1$ for $c > 1$.

This limit has been obtained in Example 3.1.11(c) for $c > 0$, using a rather ingenious
argument. We give here an alternative approach for the case $c > 1$. Note that if $z_n := c^{1/n}$,
then $z_n > 1$ and $z_{n+1} < z_n$ for all $n \in \mathbb{N}$. (Why?) Thus by the Monotone Convergence
Theorem, the limit $z := \lim(z_n)$ exists. By Theorem 3.4.2, it follows that $z = \lim(z_{2n})$. In
addition, it follows from the relation

$$z_{2n} = c^{1/2n} = (c^{1/n})^{1/2} = z_n^{1/2}$$

and Theorem 3.2.10 that

$$z = \lim(z_{2n}) = (\lim(z_n))^{1/2} = z^{1/2}.$$

Therefore we have $z^2 = z$ whence it follows that either $z = 0$ or $z = 1$. Since $z_n > 1$ for all
$n \in \mathbb{N}$, we deduce that $z = 1$.
We leave it as an exercise to the reader to consider the case 0 < c < 1. 


The following result is based on a careful negation of the definition of lim(x,) = x. It 
leads to a convenient way to establish the divergence of a sequence. 


3.4.4 Theorem Let $X = (x_n)$ be a sequence of real numbers. Then the following are
equivalent:


(i) The sequence $X = (x_n)$ does not converge to $x \in \mathbb{R}$.

(ii) There exists an $\varepsilon_0 > 0$ such that for any $k \in \mathbb{N}$, there exists $n_k \in \mathbb{N}$ such that
$n_k \ge k$ and $|x_{n_k} - x| \ge \varepsilon_0$.

(iii) There exists an $\varepsilon_0 > 0$ and a subsequence $X' = (x_{n_k})$ of $X$ such that $|x_{n_k} - x| \ge \varepsilon_0$ for
all $k \in \mathbb{N}$.


Proof. (i) $\Rightarrow$ (ii) If $(x_n)$ does not converge to $x$, then for some $\varepsilon_0 > 0$ it is impossible to
find a natural number $k$ such that for all $n \ge k$ the terms $x_n$ satisfy $|x_n - x| < \varepsilon_0$. That is, for
each $k \in \mathbb{N}$ it is not true that for all $n \ge k$ the inequality $|x_n - x| < \varepsilon_0$ holds. In other
words, for each $k \in \mathbb{N}$ there exists a natural number $n_k \ge k$ such that $|x_{n_k} - x| \ge \varepsilon_0$.

(ii) $\Rightarrow$ (iii) Let $\varepsilon_0$ be as in (ii) and let $n_1 \in \mathbb{N}$ be such that $n_1 \ge 1$ and $|x_{n_1} - x| \ge \varepsilon_0$.
Now let $n_2 \in \mathbb{N}$ be such that $n_2 > n_1$ and $|x_{n_2} - x| \ge \varepsilon_0$; let $n_3 \in \mathbb{N}$ be such that
$n_3 > n_2$ and $|x_{n_3} - x| \ge \varepsilon_0$. Continue in this way to obtain a subsequence $X' = (x_{n_k})$ of
$X$ such that $|x_{n_k} - x| \ge \varepsilon_0$ for all $k \in \mathbb{N}$.

(iii) $\Rightarrow$ (i) Suppose $X = (x_n)$ has a subsequence $X' = (x_{n_k})$ satisfying the condition
in (iii). Then $X$ cannot converge to $x$; for if it did, then, by Theorem 3.4.2, the subsequence
$X'$ would also converge to $x$. But this is impossible, since none of the terms of $X'$ belongs to
the $\varepsilon_0$-neighborhood of $x$. Q.E.D.
Since all subsequences of a convergent sequence must converge to the same limit, we 
have part (i) in the following result. Part (ii) follows from the fact that a convergent 
sequence is bounded. 


3.4.5 Divergence Criteria If a sequence $X = (x_n)$ of real numbers has either of the
following properties, then $X$ is divergent.


(i) $X$ has two convergent subsequences $X' = (x_{n_k})$ and $X'' = (x_{r_k})$ whose limits are not
equal.
(ii) X is unbounded. 


3.4.6 Examples (a) The sequence $X := ((-1)^n)$ is divergent.

The subsequence $X' := ((-1)^{2n}) = (1, 1, \ldots)$ converges to 1, and the subsequence
$X'' := ((-1)^{2n-1}) = (-1, -1, \ldots)$ converges to $-1$. Therefore, we conclude from
Theorem 3.4.5(i) that $X$ is divergent.

(b) The sequence $\left(1, \frac{1}{2}, 3, \frac{1}{4}, \ldots\right)$ is divergent.

This is the sequence $Y = (y_n)$, where $y_n = n$ if $n$ is odd, and $y_n = 1/n$ if $n$ is even. It
can easily be seen that $Y$ is not bounded. Hence, by Theorem 3.4.5(ii), the sequence is
divergent.

(c) The sequence S := (sin n) is divergent. 

This sequence is not so easy to handle. In discussing it we must, of course, make use of 
elementary properties of the sine function. We recall that $\sin(\pi/6) = \frac{1}{2} = \sin(5\pi/6)$ and
that $\sin x > \frac{1}{2}$ for $x$ in the interval $I_1 := (\pi/6, 5\pi/6)$. Since the length of $I_1$ is
$5\pi/6 - \pi/6 = 2\pi/3 > 2$, there are at least two natural numbers lying inside $I_1$; we let
$n_1$ be the first such number. Similarly, for each $k \in \mathbb{N}$, $\sin x > \frac{1}{2}$ for $x$ in the interval

$$I_k := \left(\pi/6 + 2\pi(k - 1),\ 5\pi/6 + 2\pi(k - 1)\right).$$

Since the length of $I_k$ is greater than 2, there are at least two natural numbers lying inside $I_k$;
we let $n_k$ be the first one. The subsequence $S' := (\sin n_k)$ of $S$ obtained in this way has the
property that all of its values lie in the interval $\left[\frac{1}{2}, 1\right]$.

Similarly, if $k \in \mathbb{N}$ and $J_k$ is the interval

$$J_k := \left(7\pi/6 + 2\pi(k - 1),\ 11\pi/6 + 2\pi(k - 1)\right),$$

then it is seen that $\sin x < -\frac{1}{2}$ for all $x \in J_k$ and the length of $J_k$ is greater than 2. Let $m_k$ be
the first natural number lying in $J_k$. Then the subsequence $S'' := (\sin m_k)$ of $S$ has the
property that all of its values lie in the interval $\left[-1, -\frac{1}{2}\right]$.

Given any real number $c$, it is readily seen that at least one of the subsequences $S'$ and
$S''$ lies entirely outside of the $\frac{1}{2}$-neighborhood of $c$. Therefore $c$ cannot be a limit of $S$. Since
$c \in \mathbb{R}$ is arbitrary, we deduce that $S$ is divergent.
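
The construction of the two index sequences can be checked numerically. The following Python sketch uses hypothetical helper names (`first_natural_in`, `n_k`, `m_k`) and floating-point values of π, so it is an illustration for moderate k rather than a proof.

```python
import math


def first_natural_in(lo: float, hi: float) -> int:
    """Smallest natural number n with lo < n < hi; raises if there is none."""
    n = max(1, math.floor(lo) + 1)
    if not lo < n < hi:
        raise ValueError("no natural number in the interval")
    return n


def n_k(k: int) -> int:
    """First natural number in I_k = (pi/6 + 2pi(k-1), 5pi/6 + 2pi(k-1))."""
    shift = 2 * math.pi * (k - 1)
    return first_natural_in(math.pi / 6 + shift, 5 * math.pi / 6 + shift)


def m_k(k: int) -> int:
    """First natural number in J_k = (7pi/6 + 2pi(k-1), 11pi/6 + 2pi(k-1))."""
    shift = 2 * math.pi * (k - 1)
    return first_natural_in(7 * math.pi / 6 + shift, 11 * math.pi / 6 + shift)


# Columns: k, n_k, sin(n_k), m_k, sin(m_k).
for k in range(1, 6):
    print(k, n_k(k), round(math.sin(n_k(k)), 4), m_k(k), round(math.sin(m_k(k)), 4))
```

In each printed row the first sine value exceeds 1/2 and the second is below -1/2, so the two subsequences stay at least 1 apart.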


The Existence of Monotone Subsequences 


While not every sequence is a monotone sequence, we will now show that every sequence 
has a monotone subsequence. 


3.4.7 Monotone Subsequence Theorem If $X = (x_n)$ is a sequence of real numbers,
then there is a subsequence of $X$ that is monotone.


Proof. For the purpose of this proof, we will say that the $m$th term $x_m$ is a “peak” if
$x_m \ge x_n$ for all $n$ such that $n \ge m$. (That is, $x_m$ is never exceeded by any term that follows it
in the sequence.) Note that, in a decreasing sequence, every term is a peak, while in an
increasing sequence, no term is a peak.

We will consider two cases, depending on whether $X$ has infinitely many, or finitely
many, peaks.

Case 1: $X$ has infinitely many peaks. In this case, we list the peaks by increasing
subscripts: $x_{m_1}, x_{m_2}, \ldots, x_{m_k}, \ldots$. Since each term is a peak, we have

$$x_{m_1} \ge x_{m_2} \ge \cdots \ge x_{m_k} \ge \cdots.$$

Therefore, the subsequence $(x_{m_k})$ of peaks is a decreasing subsequence of $X$.

Case 2: $X$ has a finite number (possibly zero) of peaks. Let these peaks be listed by
increasing subscripts: $x_{m_1}, x_{m_2}, \ldots, x_{m_r}$. Let $s_1 := m_r + 1$ be the first index beyond the last
peak. Since $x_{s_1}$ is not a peak, there exists $s_2 > s_1$ such that $x_{s_1} < x_{s_2}$. Since $x_{s_2}$ is not a peak,
there exists $s_3 > s_2$ such that $x_{s_2} < x_{s_3}$. Continuing in this way, we obtain an increasing
subsequence $(x_{s_k})$ of $X$. Q.E.D.


It is not difficult to see that a given sequence may have one subsequence that is 
increasing, and another subsequence that is decreasing. 


The Bolzano-Weierstrass Theorem 


We will now use the Monotone Subsequence Theorem to prove the Bolzano-Weierstrass 
Theorem, which states that every bounded sequence has a convergent subsequence. 
Because of the importance of this theorem we will also give a second proof of it based 
on the Nested Interval Property. 


3.4.8 The Bolzano-Weierstrass Theorem A bounded sequence of real numbers has a 
convergent subsequence. 


First Proof. It follows from the Monotone Subsequence Theorem that if $X = (x_n)$ is a
bounded sequence, then it has a subsequence $X' = (x_{n_k})$ that is monotone. Since this
subsequence is also bounded, it follows from the Monotone Convergence Theorem 3.3.2
that the subsequence is convergent. Q.E.D.


Second Proof. Since the set of values $\{x_n : n \in \mathbb{N}\}$ is bounded, this set is contained in an
interval $I_1 := [a, b]$. We take $n_1 := 1$.

We now bisect $I_1$ into two equal subintervals $I_1'$ and $I_1''$, and divide the set of indices
$\{n \in \mathbb{N} : n > 1\}$ into two parts:

$$A_1 := \{n \in \mathbb{N} : n > n_1,\ x_n \in I_1'\}, \qquad B_1 := \{n \in \mathbb{N} : n > n_1,\ x_n \in I_1''\}.$$

If $A_1$ is infinite, we take $I_2 := I_1'$ and let $n_2$ be the smallest natural number in $A_1$. If $A_1$ is a
finite set, then $B_1$ must be infinite, and we take $I_2 := I_1''$ and let $n_2$ be the smallest natural
number in $B_1$.

We now bisect $I_2$ into two equal subintervals $I_2'$ and $I_2''$, and divide the set
$\{n \in \mathbb{N} : n > n_2\}$ into two parts:

$$A_2 := \{n \in \mathbb{N} : n > n_2,\ x_n \in I_2'\}, \qquad B_2 := \{n \in \mathbb{N} : n > n_2,\ x_n \in I_2''\}.$$

If $A_2$ is infinite, we take $I_3 := I_2'$ and let $n_3$ be the smallest natural number in $A_2$. If $A_2$ is a
finite set, then $B_2$ must be infinite, and we take $I_3 := I_2''$ and let $n_3$ be the smallest natural
number in $B_2$.




We continue in this way to obtain a sequence of nested intervals $I_1 \supseteq I_2 \supseteq \cdots \supseteq
I_k \supseteq \cdots$ and a subsequence $(x_{n_k})$ of $X$ such that $x_{n_k} \in I_k$ for $k \in \mathbb{N}$. Since the length of $I_k$ is
equal to $(b - a)/2^{k-1}$, it follows from Theorem 2.5.3 that there is a (unique) common point
$\xi \in I_k$ for all $k \in \mathbb{N}$. Moreover, since $x_{n_k}$ and $\xi$ both belong to $I_k$, we have

$$|x_{n_k} - \xi| \le (b - a)/2^{k-1},$$

whence it follows that the subsequence $(x_{n_k})$ of $X$ converges to $\xi$. Q.E.D.


Theorem 3.4.8 is sometimes called the Bolzano-Weierstrass Theorem for sequences, 
because there is another version of it that deals with bounded sets in R (see Exercise 11.2.6). 

It is readily seen that a bounded sequence can have various subsequences that converge 
to different limits or even diverge. For example, the sequence $((-1)^n)$ has subsequences
that converge to $-1$, other subsequences that converge to $+1$, and it has subsequences that
diverge. 

Let X be a sequence of real numbers and let X’ be a subsequence of X. Then X’ is a 
sequence in its own right, and so it has subsequences. We note that if X” is a subsequence of 
X’, then it is also a subsequence of X. 


3.4.9 Theorem Let $X = (x_n)$ be a bounded sequence of real numbers and let $x \in \mathbb{R}$ have
the property that every convergent subsequence of $X$ converges to $x$. Then the sequence $X$
converges to $x$.


Proof. Suppose $M > 0$ is a bound for the sequence $X$ so that $|x_n| \le M$ for all $n \in \mathbb{N}$. If $X$
does not converge to $x$, then Theorem 3.4.4 implies that there exist $\varepsilon_0 > 0$ and a
subsequence $X' = (x_{n_k})$ of $X$ such that

$$(1) \qquad |x_{n_k} - x| \ge \varepsilon_0 \quad \text{for all } k \in \mathbb{N}.$$

Since $X'$ is a subsequence of $X$, the number $M$ is also a bound for $X'$. Hence the
Bolzano-Weierstrass Theorem implies that $X'$ has a convergent subsequence $X''$. Since $X''$ is also a
subsequence of $X$, it converges to $x$ by hypothesis. Thus, its terms ultimately belong to the
$\varepsilon_0$-neighborhood of $x$, contradicting (1). Q.E.D.


Limit Superior and Limit Inferior 


A bounded sequence of real numbers $(x_n)$ may or may not converge, but we know from the
Bolzano-Weierstrass Theorem 3.4.8 that there will be a convergent subsequence and possibly
many convergent subsequences. A real number that is the limit of a subsequence of $(x_n)$
is called a subsequential limit of $(x_n)$. We let $S$ denote the set of all subsequential limits of
the bounded sequence $(x_n)$. The set $S$ is bounded, because the sequence is bounded.

For example, if $(x_n)$ is defined by $x_n := (-1)^n + 2/n$, then the subsequence $(x_{2n})$
converges to 1, and the subsequence $(x_{2n-1})$ converges to $-1$. It is easily seen that the set of
subsequential limits is $S = \{-1, 1\}$. Observe that the largest member of the sequence itself
is $x_2 = 2$, which provides no information concerning the limiting behavior of the sequence.

An extreme example is given by the set of all rational numbers in the interval $[0, 1]$.
The set is denumerable (see Section 1.3) and therefore it can be written as a sequence $(r_n)$.
Then it follows from the Density Theorem 2.4.8 that every number in $[0, 1]$ is a
subsequential limit of $(r_n)$. Thus we have $S = [0, 1]$.

A bounded sequence $(x_n)$ that diverges will display some form of oscillation. The
activity is contained in decreasing intervals as follows. The interval $[t_1, u_1]$, where $t_1 :=
\inf\{x_n : n \in \mathbb{N}\}$ and $u_1 := \sup\{x_n : n \in \mathbb{N}\}$, contains the entire sequence. If for each
$m = 1, 2, \ldots$, we define $t_m := \inf\{x_n : n \ge m\}$ and $u_m := \sup\{x_n : n \ge m\}$, the sequences
$(t_m)$ and $(u_m)$ are monotone and we obtain a nested sequence of intervals $[t_m, u_m]$ where the
$m$th interval contains the $m$-tail of the sequence.

The preceding discussion suggests different ways of describing limiting behavior of a
bounded sequence. Another is to observe that if a real number $v$ has the property that $x_n > v$
for at most a finite number of values of $n$, then no subsequence of $(x_n)$ can converge to a
limit larger than $v$ because that would require infinitely many terms of the sequence be
larger than $v$. In other words, if $v$ has the property that there exists $N_v$ such that $x_n \le v$ for all
$n \ge N_v$, then no number larger than $v$ can be a subsequential limit of $(x_n)$.

This observation leads to the following definition of limit superior. The accompanying 
definition of limit inferior is similar. 


3.4.10 Definition Let $X = (x_n)$ be a bounded sequence of real numbers.


(a) The limit superior of $(x_n)$ is the infimum of the set $V$ of $v \in \mathbb{R}$ such that $v < x_n$ for at
most a finite number of $n \in \mathbb{N}$. It is denoted by

$$\limsup(x_n) \quad \text{or} \quad \limsup X \quad \text{or} \quad \overline{\lim}(x_n).$$

(b) The limit inferior of $(x_n)$ is the supremum of the set of $w \in \mathbb{R}$ such that $x_m < w$ for at
most a finite number of $m \in \mathbb{N}$. It is denoted by

$$\liminf(x_n) \quad \text{or} \quad \liminf X \quad \text{or} \quad \underline{\lim}(x_n).$$


For the concept of limit superior, we now show that the different approaches are equivalent. 


3.4.11 Theorem If $(x_n)$ is a bounded sequence of real numbers, then the following
statements for a real number $x^*$ are equivalent.


(a) $x^* = \limsup(x_n)$.

(b) If $\varepsilon > 0$, there are at most a finite number of $n \in \mathbb{N}$ such that $x^* + \varepsilon < x_n$, but an
infinite number of $n \in \mathbb{N}$ such that $x^* - \varepsilon < x_n$.

(c) If $u_m = \sup\{x_n : n \ge m\}$, then $x^* = \inf\{u_m : m \in \mathbb{N}\} = \lim(u_m)$.

(d) If $S$ is the set of subsequential limits of $(x_n)$, then $x^* = \sup S$.


Proof. (a) implies (b). If $\varepsilon > 0$, then the fact that $x^*$ is an infimum implies that there
exists a $v$ in $V$ such that $x^* \le v < x^* + \varepsilon$. Therefore $x^* + \varepsilon$ also belongs to $V$, so there can be at
most a finite number of $n \in \mathbb{N}$ such that $x^* + \varepsilon < x_n$. On the other hand, $x^* - \varepsilon$ is not in $V$
so there are an infinite number of $n \in \mathbb{N}$ such that $x^* - \varepsilon < x_n$.

(b) implies (c). If (b) holds, given $\varepsilon > 0$, then for all sufficiently large $m$ we have
$u_m \le x^* + \varepsilon$. Therefore, $\inf\{u_m : m \in \mathbb{N}\} \le x^* + \varepsilon$. Also, since there are an infinite number
of $n \in \mathbb{N}$ such that $x^* - \varepsilon < x_n$, then $x^* - \varepsilon < u_m$ for all $m \in \mathbb{N}$ and hence $x^* - \varepsilon \le
\inf\{u_m : m \in \mathbb{N}\}$. Since $\varepsilon > 0$ is arbitrary, we conclude that $x^* = \inf\{u_m : m \in \mathbb{N}\}$.
Moreover, since the sequence $(u_m)$ is monotone decreasing, we have $\inf(u_m) = \lim(u_m)$.

(c) implies (d). Suppose that $X' = (x_{n_k})$ is a convergent subsequence of $X = (x_n)$.
Since $n_k \ge k$, we have $x_{n_k} \le u_k$ and hence $\lim X' \le \lim(u_k) = x^*$. Conversely, there exists
$n_1$ such that $u_1 - 1 \le x_{n_1} \le u_1$. Inductively choose $n_{k+1} > n_k$ such that

$$u_k - \frac{1}{k+1} < x_{n_{k+1}} \le u_k.$$

Since $\lim(u_k) = x^*$, it follows that $x^* = \lim(x_{n_k})$, and hence $x^* \in S$.

(d) implies (a). Let $w = \sup S$. If $\varepsilon > 0$ is given, then there are at most finitely many $n$
with $w + \varepsilon < x_n$. Therefore $w + \varepsilon$ belongs to $V$ and $\limsup(x_n) \le w + \varepsilon$. On the other
hand, there exists a subsequence of $(x_n)$ converging to some number larger than $w - \varepsilon$, so
that $w - \varepsilon$ is not in $V$, and hence $w - \varepsilon \le \limsup(x_n)$. Since $\varepsilon > 0$ is arbitrary, we
conclude that $w = \limsup(x_n)$. Q.E.D.
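
Characterization (c) can be explored numerically for the sequence $x_n := (-1)^n + 2/n$ discussed earlier in this section. The sketch below is only an illustration: the names `x` and `tail_sup` and the finite `horizon` are assumptions, and a finite maximum stands in for the supremum. For this particular sequence the maximum of the tail is attained at the first even index, so the truncation loses nothing.

```python
from fractions import Fraction


def x(n: int) -> Fraction:
    """The n-th term x_n = (-1)**n + 2/n, computed exactly."""
    return Fraction((-1) ** n) + Fraction(2, n)


def tail_sup(m: int, horizon: int = 50) -> Fraction:
    """Maximum of x_n over m <= n < m + horizon, used as a stand-in for u_m."""
    return max(x(n) for n in range(m, m + horizon))


# u_m decreases toward 1, the limit superior of the sequence.
for m in (1, 2, 3, 10, 100, 1000):
    print(m, tail_sup(m), float(tail_sup(m)))
```

The values u_m decrease toward 1, the larger of the two subsequential limits, even though the largest term of the sequence is 2.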


As an instructive exercise, the reader should formulate the corresponding theorem for 
the limit inferior of a bounded sequence of real numbers. 


3.4.12 Theorem A bounded sequence $(x_n)$ is convergent if and only if $\limsup(x_n) =
\liminf(x_n)$.


We leave the proof as an exercise. Other basic properties can also be found in the 
exercises. 


Exercises for Section 3.4 


1. Give an example of an unbounded sequence that has a convergent subsequence. 
2. Use the method of Example 3.4.3(b) to show that if $0 < c < 1$, then $\lim(c^{1/n}) = 1$.


3. Let $(f_n)$ be the Fibonacci sequence of Example 3.1.2(d), and let $x_n := f_{n+1}/f_n$. Given that
$\lim(x_n) = L$ exists, determine the value of $L$.


4. Show that the following sequences are divergent.

(a) $(1 - (-1)^n + 1/n)$, (b) $(\sin n\pi/4)$.

5. Let $X = (x_n)$ and $Y = (y_n)$ be given sequences, and let the “shuffled” sequence $Z = (z_n)$ be
defined by $z_1 := x_1$, $z_2 := y_1$, $\ldots$, $z_{2n-1} := x_n$, $z_{2n} := y_n$, $\ldots$. Show that $Z$ is convergent if and
only if both $X$ and $Y$ are convergent and $\lim X = \lim Y$.


6. Let $x_n := n^{1/n}$ for $n \in \mathbb{N}$.
(a) Show that $x_{n+1} < x_n$ if and only if $(1 + 1/n)^n < n$, and infer that the inequality is valid for
$n \ge 3$. (See Example 3.3.6.) Conclude that $(x_n)$ is ultimately decreasing and that $x :=
\lim(x_n)$ exists.
(b) Use the fact that the subsequence $(x_{2n})$ also converges to $x$ to conclude that $x = 1$.


7. Establish the convergence and find the limits of the following sequences:

(a) $\left((1 + 1/n^2)^{n^2}\right)$, (b) $\left((1 + 1/2n)^{n}\right)$,
(c) $\left((1 + 1/n^2)^{2n^2}\right)$, (d) $\left((1 + 2/n)^{n}\right)$.
8. Determine the limits of the following.
(a) $\left((3n)^{1/2n}\right)$, (b) $\left((1 + 1/2n)^{3n}\right)$.
9. Suppose that every subsequence of $X = (x_n)$ has a subsequence that converges to 0. Show that
$\lim X = 0$.


10. Let $(x_n)$ be a bounded sequence and for each $n \in \mathbb{N}$ let $s_n := \sup\{x_k : k \ge n\}$ and $S := \inf\{s_n\}$.
Show that there exists a subsequence of $(x_n)$ that converges to $S$.


11. Suppose that $x_n \ge 0$ for all $n \in \mathbb{N}$ and that $\lim((-1)^n x_n)$ exists. Show that $(x_n)$ converges.


12. Show that if $(x_n)$ is unbounded, then there exists a subsequence $(x_{n_k})$ such that
$\lim(1/x_{n_k}) = 0$.


13. If $x_n := (-1)^n/n$, find the subsequence of $(x_n)$ that is constructed in the second proof of the
Bolzano-Weierstrass Theorem 3.4.8, when we take $I_1 := [-1, 1]$.

14. Let $(x_n)$ be a bounded sequence and let $s := \sup\{x_n : n \in \mathbb{N}\}$. Show that if $s \notin \{x_n : n \in \mathbb{N}\}$,
then there is a subsequence of $(x_n)$ that converges to $s$.


15. Let $(I_n)$ be a nested sequence of closed bounded intervals. For each $n \in \mathbb{N}$, let $x_n \in I_n$. Use the
Bolzano-Weierstrass Theorem to give a proof of the Nested Intervals Property 2.5.2.


16. Give an example to show that Theorem 3.4.9 fails if the hypothesis that $X$ is a bounded sequence
is dropped.


17. Alternate the terms of the sequences $(1 + 1/n)$ and $(-1/n)$ to obtain the sequence $(x_n)$ given by
$(2, -1, 3/2, -1/2, 4/3, -1/3, 5/4, -1/4, \ldots)$.
Determine the values of $\limsup(x_n)$ and $\liminf(x_n)$. Also find $\sup\{x_n\}$ and $\inf\{x_n\}$.


18. Show that if $(x_n)$ is a bounded sequence, then $(x_n)$ converges if and only if $\limsup(x_n) =
\liminf(x_n)$.


19. Show that if $(x_n)$ and $(y_n)$ are bounded sequences, then
$\limsup(x_n + y_n) \le \limsup(x_n) + \limsup(y_n)$.


Give an example in which the two sides are not equal. 


Section 3.5 The Cauchy Criterion 


The Monotone Convergence Theorem is extraordinarily useful and important, but it has the 
significant drawback that it applies only to sequences that are monotone. It is important for 
us to have a condition implying the convergence of a sequence that does not require us to 
know the value of the limit in advance, and is not restricted to monotone sequences. The 
Cauchy Criterion, which will be established in this section, is such a condition. 


3.5.1 Definition A sequence $X = (x_n)$ of real numbers is said to be a Cauchy sequence
if for every $\varepsilon > 0$ there exists a natural number $H(\varepsilon)$ such that for all natural numbers
$n, m \ge H(\varepsilon)$, the terms $x_n, x_m$ satisfy $|x_n - x_m| < \varepsilon$.


The significance of the concept of Cauchy sequence lies in the main theorem of this 
section, which asserts that a sequence of real numbers is convergent if and only if it is a 
Cauchy sequence. This will give us a method of proving a sequence converges without 
knowing the limit of the sequence. 

However, we will first highlight the definition of Cauchy sequence in the following 
examples. 


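3.5.2 Examples (a) The sequence $(1/n)$ is a Cauchy sequence.

If $\varepsilon > 0$ is given, we choose a natural number $H = H(\varepsilon)$ such that $H > 2/\varepsilon$. Then if
$m, n \ge H$, we have $1/n \le 1/H < \varepsilon/2$ and similarly $1/m < \varepsilon/2$. Therefore, it follows that
if $m, n \ge H$, then

$$\left|\frac{1}{n} - \frac{1}{m}\right| \le \frac{1}{n} + \frac{1}{m} < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.$$

Since $\varepsilon > 0$ is arbitrary, we conclude that $(1/n)$ is a Cauchy sequence.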
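
The choice of H in this argument can be tested on a finite window of indices. The Python sketch below uses exact rational arithmetic. The function name `H` follows the text, but the sample value of ε and the window size are arbitrary, and checking finitely many pairs is an illustration, not a proof.

```python
import math
from fractions import Fraction


def H(eps: Fraction) -> int:
    """A natural number H with H > 2/eps, as chosen in the argument above."""
    return math.floor(2 / eps) + 1


eps = Fraction(1, 100)
start = H(eps)

# Largest gap |1/n - 1/m| over a window of indices n, m >= H(eps).
worst = max(
    abs(Fraction(1, n) - Fraction(1, m))
    for n in range(start, start + 50)
    for m in range(start, start + 50)
)
print(start, worst, worst < eps)
```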


(b) The sequence $(1 + (-1)^n)$ is not a Cauchy sequence.
The negation of the definition of Cauchy sequence is: There exists $\varepsilon_0 > 0$ such that for
every $H$ there exist at least one $n > H$ and at least one $m > H$ such that $|x_n - x_m| \ge \varepsilon_0$. For
the terms $x_n := 1 + (-1)^n$, we observe that if $n$ is even, then $x_n = 2$ and $x_{n+1} = 0$. If we
take $\varepsilon_0 = 2$, then for any $H$ we can choose an even number $n > H$ and let $m := n + 1$ to get

$$|x_n - x_{n+1}| = 2 = \varepsilon_0.$$

We conclude that $(x_n)$ is not a Cauchy sequence.


Remark We emphasize that to prove a sequence $(x_n)$ is a Cauchy sequence, we may not
assume a relationship between $m$ and $n$, since the required inequality $|x_n - x_m| < \varepsilon$ must
hold for all $n, m \ge H(\varepsilon)$. But to prove a sequence is not a Cauchy sequence, we may specify
a relation between $n$ and $m$ as long as arbitrarily large values of $n$ and $m$ can be chosen so
that $|x_n - x_m| \ge \varepsilon_0$.


Our goal is to show that the Cauchy sequences are precisely the convergent sequences. 
We first prove that a convergent sequence is a Cauchy sequence. 


3.5.3 Lemma If $X = (x_n)$ is a convergent sequence of real numbers, then $X$ is a Cauchy
sequence.


Proof. If $x := \lim X$, then given $\varepsilon > 0$ there is a natural number $K(\varepsilon/2)$ such that if
$n \ge K(\varepsilon/2)$ then $|x_n - x| < \varepsilon/2$. Thus, if $H(\varepsilon) := K(\varepsilon/2)$ and if $n, m \ge H(\varepsilon)$, then we
have

$$|x_n - x_m| = |(x_n - x) + (x - x_m)| \le |x_n - x| + |x_m - x| < \varepsilon/2 + \varepsilon/2 = \varepsilon.$$

Since $\varepsilon > 0$ is arbitrary, it follows that $(x_n)$ is a Cauchy sequence. Q.E.D.


In order to establish that a Cauchy sequence is convergent, we will need the following 
result. (See Theorem 3.2.2.) 


3.5.4 Lemma A Cauchy sequence of real numbers is bounded. 


Proof. Let $X := (x_n)$ be a Cauchy sequence and let $\varepsilon := 1$. If $H := H(1)$ and $n \ge H$,
then $|x_n - x_H| < 1$. Hence, by the Triangle Inequality, we have $|x_n| \le |x_H| + 1$ for all
$n \ge H$. If we set

$$M := \sup\{|x_1|, |x_2|, \ldots, |x_{H-1}|, |x_H| + 1\},$$

then it follows that $|x_n| \le M$ for all $n \in \mathbb{N}$. Q.E.D.


We now present the important Cauchy Convergence Criterion. 


3.5.5 Cauchy Convergence Criterion A sequence of real numbers is convergent if and 
only if it is a Cauchy sequence. 


Proof. We have seen, in Lemma 3.5.3, that a convergent sequence is a Cauchy 
sequence. 

Conversely, let $X = (x_n)$ be a Cauchy sequence; we will show that $X$ is convergent to
some real number. First we observe from Lemma 3.5.4 that the sequence $X$ is bounded.
Therefore, by the Bolzano-Weierstrass Theorem 3.4.8, there is a subsequence $X' = (x_{n_k})$
of $X$ that converges to some real number $x^*$. We shall complete the proof by showing that $X$
converges to $x^*$.

Since $X = (x_n)$ is a Cauchy sequence, given $\varepsilon > 0$ there is a natural number $H(\varepsilon/2)$
such that if $n, m \ge H(\varepsilon/2)$ then

$$(1) \qquad |x_n - x_m| < \varepsilon/2.$$

Since the subsequence $X' = (x_{n_k})$ converges to $x^*$, there is a natural number $K \ge H(\varepsilon/2)$
belonging to the set $\{n_1, n_2, \ldots\}$ such that

$$|x_K - x^*| < \varepsilon/2.$$

Since $K \ge H(\varepsilon/2)$, it follows from (1) with $m = K$ that

$$|x_n - x_K| < \varepsilon/2 \quad \text{for } n \ge H(\varepsilon/2).$$

Therefore, if $n \ge H(\varepsilon/2)$, we have

$$|x_n - x^*| = |(x_n - x_K) + (x_K - x^*)| \le |x_n - x_K| + |x_K - x^*| < \varepsilon/2 + \varepsilon/2 = \varepsilon.$$

Since $\varepsilon > 0$ is arbitrary, we infer that $\lim(x_n) = x^*$. Therefore the sequence $X$ is
convergent. Q.E.D.


We will now give some examples of applications of the Cauchy Criterion. 


3.5.6 Examples (a) Let $X = (x_n)$ be defined by

$$x_1 := 1, \quad x_2 := 2, \quad \text{and} \quad x_n := \frac{1}{2}(x_{n-2} + x_{n-1}) \quad \text{for } n > 2.$$

It can be shown by Induction that $1 \le x_n \le 2$ for all $n \in \mathbb{N}$. (Do so.) Some calculation
shows that the sequence $X$ is not monotone. However, since the terms are formed by
averaging, it is readily seen that

$$|x_n - x_{n+1}| = \frac{1}{2^{n-1}} \quad \text{for } n \in \mathbb{N}.$$

(Prove this by Induction.) Thus, if $m > n$, we may employ the Triangle Inequality to obtain

$$|x_n - x_m| \le |x_n - x_{n+1}| + |x_{n+1} - x_{n+2}| + \cdots + |x_{m-1} - x_m|$$
$$= \frac{1}{2^{n-1}} + \frac{1}{2^{n}} + \cdots + \frac{1}{2^{m-2}}$$
$$= \frac{1}{2^{n-1}}\left(1 + \frac{1}{2} + \cdots + \frac{1}{2^{m-n-1}}\right) < \frac{1}{2^{n-2}}.$$

Therefore, given $\varepsilon > 0$, if $n$ is chosen so large that $1/2^n < \varepsilon/4$ and if $m \ge n$, then it follows
that $|x_n - x_m| < \varepsilon$. Therefore, $X$ is a Cauchy sequence in $\mathbb{R}$. By the Cauchy Criterion 3.5.5
we infer that the sequence $X$ converges to a number $x$.

To evaluate the limit $x$, we might first “pass to the limit” in the rule of definition
$x_n = \frac{1}{2}(x_{n-1} + x_{n-2})$ to conclude that $x$ must satisfy the relation $x = \frac{1}{2}(x + x)$, which is
true, but not informative. Hence we must try something else.

Since $X$ converges to $x$, so does the subsequence $X'$ with odd indices. By Induction, the
reader can establish that [see 1.2.4(f)]

$$x_{2n+1} = 1 + \frac{1}{2} + \frac{1}{2^3} + \cdots + \frac{1}{2^{2n-1}} = 1 + \frac{2}{3}\left(1 - \frac{1}{4^n}\right).$$

It follows from this (how?) that $x = \lim X = \lim X' = 1 + \frac{2}{3} = \frac{5}{3}$.
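
The claims in this example can be checked for the first few dozen terms with exact rational arithmetic. The sketch below assumes a hypothetical helper `averaging_sequence` and uses zero-based list positions, so `xs[n - 1]` holds $x_n$.

```python
from fractions import Fraction
from typing import List


def averaging_sequence(count: int) -> List[Fraction]:
    """Return [x_1, ..., x_count] with x_1 = 1, x_2 = 2, x_n = (x_{n-2} + x_{n-1}) / 2."""
    xs = [Fraction(1), Fraction(2)]
    while len(xs) < count:
        xs.append((xs[-2] + xs[-1]) / 2)
    return xs[:count]


xs = averaging_sequence(40)

# Consecutive gaps |x_n - x_{n+1}| for n = 1, ..., 5.
print([abs(xs[n - 1] - xs[n]) for n in range(1, 6)])

# The last computed term compared with the claimed limit 5/3.
print(float(xs[-1]), float(Fraction(5, 3)))
```

The gaps printed are 1, 1/2, 1/4, 1/8, 1/16, and the terms oscillate around 5/3 while closing in on it.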


(b) Let $Y = (y_n)$ be the sequence of real numbers given by

$$y_1 := \frac{1}{1!}, \quad y_2 := \frac{1}{1!} - \frac{1}{2!}, \quad \ldots, \quad y_n := \frac{1}{1!} - \frac{1}{2!} + \cdots + \frac{(-1)^{n+1}}{n!}, \quad \ldots$$

Clearly, $Y$ is not a monotone sequence. However, if $m > n$, then

$$y_m - y_n = \frac{(-1)^{n+2}}{(n+1)!} + \frac{(-1)^{n+3}}{(n+2)!} + \cdots + \frac{(-1)^{m+1}}{m!}.$$

Since $2^{r-1} \le r!$ [see 1.2.4(e)], it follows that if $m > n$, then (why?)

$$|y_m - y_n| \le \frac{1}{(n+1)!} + \frac{1}{(n+2)!} + \cdots + \frac{1}{m!}$$
$$\le \frac{1}{2^{n}} + \frac{1}{2^{n+1}} + \cdots + \frac{1}{2^{m-1}} < \frac{1}{2^{n-1}}.$$

Therefore, it follows that $(y_n)$ is a Cauchy sequence. Hence it converges to a limit $y$. At the
present moment we cannot evaluate $y$ directly; however, passing to the limit (with respect
to $m$) in the above inequality, we obtain

$$|y_n - y| \le 1/2^{n-1}.$$

Hence we can calculate $y$ to any desired accuracy by calculating the terms $y_n$ for
sufficiently large $n$. The reader should do this and show that $y$ is approximately equal
to 0.632 120 559. (The exact value of $y$ is $1 - 1/e$.)
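
The calculation left to the reader can be sketched as follows. The function name `y` and the choice of 15 terms are illustrative. Exact fractions are used for the partial sums, and only the final comparison with $1 - 1/e$ is done in floating point.

```python
import math
from fractions import Fraction


def y(n: int) -> Fraction:
    """Partial sum y_n = 1/1! - 1/2! + ... + (-1)**(n+1)/n!, computed exactly."""
    total = Fraction(0)
    for k in range(1, n + 1):
        total += Fraction((-1) ** (k + 1), math.factorial(k))
    return total


# Compare a partial sum with the stated exact value 1 - 1/e.
print(f"y_15    = {float(y(15)):.12f}")
print(f"1 - 1/e = {1 - 1 / math.e:.12f}")
```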


(c) The sequence $\left(\frac{1}{1} + \frac{1}{2} + \cdots + \frac{1}{n}\right)$ diverges.

Let $H := (h_n)$ be the sequence defined by

$$h_n := \frac{1}{1} + \frac{1}{2} + \cdots + \frac{1}{n} \quad \text{for } n \in \mathbb{N},$$

which was considered in 3.3.3(b). If $m > n$, then

$$h_m - h_n = \frac{1}{n+1} + \cdots + \frac{1}{m}.$$

Since each of these $m - n$ terms is at least $1/m$, then $h_m - h_n \ge (m - n)/m = 1 - n/m$.
In particular, if $m = 2n$ we have $h_{2n} - h_n \ge \frac{1}{2}$. This shows that $H$ is not a Cauchy sequence
(why?); therefore $H$ is not a convergent sequence. (In terms that will be introduced in Section 3.7,
we have just proved that the “harmonic series” $\sum_{n=1}^{\infty} 1/n$ is divergent.)
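
The gap between $h_{2n}$ and $h_n$ can be computed exactly for small n. In this Python sketch the function name `h` follows the text and the range of n is an arbitrary choice. It illustrates the estimate and does not replace the argument.

```python
from fractions import Fraction


def h(n: int) -> Fraction:
    """Partial sum h_n = 1/1 + 1/2 + ... + 1/n, computed exactly."""
    return sum((Fraction(1, k) for k in range(1, n + 1)), Fraction(0))


# The gap h_{2n} - h_n never drops below 1/2, however large n is taken.
for n in (1, 2, 4, 8, 16, 32):
    gap = h(2 * n) - h(n)
    print(n, gap, float(gap))
```

Because the gap stays at least 1/2, no choice of H(ε) can work for ε = 1/2, which is why the sequence fails the Cauchy condition.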


3.5.7 Definition We say that a sequence $X = (x_n)$ of real numbers is contractive if there
exists a constant $C$, $0 < C < 1$, such that

$$|x_{n+2} - x_{n+1}| \le C\,|x_{n+1} - x_n|$$

for all $n \in \mathbb{N}$. The number $C$ is called the constant of the contractive sequence.
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3.5.8 Theorem Every contractive sequence is a Cauchy sequence, and therefore is 
convergent. 


Proof. If we successively apply the defining condition for a contractive sequence, we can
work our way back to the beginning of the sequence as follows:

\[ |x_{n+2} - x_{n+1}| \le C|x_{n+1} - x_n| \le C^2|x_n - x_{n-1}| \le C^3|x_{n-1} - x_{n-2}| \le \cdots \le C^n|x_2 - x_1|. \]


For $m > n$, we estimate $|x_m - x_n|$ by first applying the Triangle Inequality and then using
the formula for the sum of a geometric progression (see 1.2.4(f)). This gives

\[ \begin{aligned}
|x_m - x_n| &\le |x_m - x_{m-1}| + |x_{m-1} - x_{m-2}| + \cdots + |x_{n+1} - x_n| \\
&\le \left( C^{m-2} + C^{m-3} + \cdots + C^{n-1} \right) |x_2 - x_1| \\
&= C^{n-1} \left( \frac{1 - C^{m-n}}{1 - C} \right) |x_2 - x_1| \\
&\le C^{n-1} \left( \frac{1}{1 - C} \right) |x_2 - x_1|.
\end{aligned} \]

Since $0 < C < 1$, we know $\lim(C^n) = 0$ [see 3.1.11(b)]. Therefore, we infer that $(x_n)$ is a
Cauchy sequence. It now follows from the Cauchy Convergence Criterion 3.5.5 that $(x_n)$ is
a convergent sequence. Q.E.D.


3.5.9 Example We consider the sequence of Fibonacci fractions $x_n := f_n/f_{n+1}$, where
$f_1 = f_2 = 1$ and $f_{n+1} = f_n + f_{n-1}$. (See Example 3.1.2(d).) The first few terms are
$x_1 = 1$, $x_2 = 1/2$, $x_3 = 2/3$, $x_4 = 3/5$, $x_5 = 5/8$, and so on. It is shown that the
sequence $(x_n)$ is given inductively by the equation $x_{n+1} = 1/(1 + x_n)$ as follows:

\[ x_{n+1} = \frac{f_{n+1}}{f_{n+2}} = \frac{f_{n+1}}{f_{n+1} + f_n} = \frac{1}{1 + \dfrac{f_n}{f_{n+1}}} = \frac{1}{1 + x_n}. \]

An induction argument establishes $1/2 \le x_n \le 1$ for all $n$, so that adding 1 and taking
reciprocals gives us the inequality $1/2 \le 1/(1 + x_n) \le 2/3$ for all $n$. It then follows that

\[ |x_{n+1} - x_n| = \frac{|x_n - x_{n-1}|}{(1 + x_n)(1 + x_{n-1})} \le \frac{2}{3} \cdot \frac{2}{3}\,|x_n - x_{n-1}| = \frac{4}{9}\,|x_n - x_{n-1}|. \]

Hence, the sequence $(x_n)$ is contractive and therefore converges by Theorem 3.5.8. Passing
to the limit $x = \lim(x_n)$, we obtain the equation $x = 1/(1 + x)$, so that $x$ satisfies the
equation $x^2 + x - 1 = 0$. The quadratic formula gives us the positive solution
$x = (-1 + \sqrt{5})/2 = 0.618034\ldots$.

The reciprocal $1/x = (1 + \sqrt{5})/2 = 1.618034\ldots$ is often denoted by the Greek letter $\varphi$
and referred to as the Golden Ratio in the history of geometry. In the artistic theory of the ancient
Greek philosophers, a rectangle having $\varphi$ as the ratio of the longer side to the shorter side is the
rectangle most pleasing to the eye. The number also has many interesting mathematical
properties. (A historical discussion of the Golden Ratio can be found on Wikipedia.)
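The contraction estimate for the Fibonacci fractions can be observed numerically. The sketch below is an illustration only; the list name `x`, the number of iterations, and the range of ratios examined are assumptions made for the example. It iterates $x_{n+1} = 1/(1 + x_n)$ from $x_1 = 1$ and compares successive differences with the constant 4/9.

```python
import math

x = [1.0]                                  # x_1 = f_1 / f_2 = 1
for _ in range(30):
    x.append(1 / (1 + x[-1]))              # x_{n+1} = 1 / (1 + x_n)

# Ratios |x_{n+1} - x_n| / |x_n - x_{n-1}|, which should not exceed C = 4/9.
ratios = [abs(x[i + 1] - x[i]) / abs(x[i] - x[i - 1]) for i in range(1, 15)]

golden_inverse = (math.sqrt(5) - 1) / 2    # positive root of x**2 + x - 1 = 0
print(max(ratios), x[-1], golden_inverse)
```

The observed ratios settle well below 4/9, so that constant is a safe but not sharp choice.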


In the process of calculating the limit of a contractive sequence, it is often very
important to have an estimate of the error at the nth stage. In the next result we give two
such estimates: the first one involves the first two terms in the sequence and $n$; the second
one involves the difference $x_n - x_{n-1}$.


3.5.10 Corollary If $X := (x_n)$ is a contractive sequence with constant $C$, $0 < C < 1$,
and if $x^* := \lim X$, then

(i) $|x^* - x_n| \le \dfrac{C^{n-1}}{1 - C}\,|x_2 - x_1|$,

(ii) $|x^* - x_n| \le \dfrac{C}{1 - C}\,|x_n - x_{n-1}|$.


Proof. From the preceding proof, if $m > n$, then $|x_m - x_n| \le (C^{n-1}/(1 - C))\,|x_2 - x_1|$.
If we let $m \to \infty$ in this inequality, we obtain (i).
To prove (ii), recall that if $m > n$, then

\[ |x_m - x_n| \le |x_m - x_{m-1}| + \cdots + |x_{n+1} - x_n|. \]

Since it is readily established, using Induction, that

\[ |x_{n+k} - x_{n+k-1}| \le C^k\,|x_n - x_{n-1}|, \]

we infer that

\[ |x_m - x_n| \le \left( C^{m-n} + \cdots + C^2 + C \right) |x_n - x_{n-1}| \le \frac{C}{1 - C}\,|x_n - x_{n-1}|. \]

We now let $m \to \infty$ in this inequality to obtain assertion (ii). Q.E.D.


3.5.11 Example We are told that the cubic equation $x^3 - 7x + 2 = 0$ has a solution
between 0 and 1 and we wish to approximate this solution. This can be accomplished by
means of an iteration procedure as follows. We first rewrite the equation as $x = (x^3 + 2)/7$
and use this to define a sequence. We assign to $x_1$ an arbitrary value between 0 and 1, and then
define

\[ x_{n+1} := \tfrac{1}{7}\left( x_n^3 + 2 \right) \quad \text{for } n \in \mathbb{N}. \]

Because $0 < x_1 < 1$, it follows that $0 < x_n < 1$ for all $n \in \mathbb{N}$. (Why?) Moreover, we
have

\[ \begin{aligned}
|x_{n+2} - x_{n+1}| &= \left| \tfrac{1}{7}\left( x_{n+1}^3 + 2 \right) - \tfrac{1}{7}\left( x_n^3 + 2 \right) \right| = \tfrac{1}{7}\,|x_{n+1}^3 - x_n^3| \\
&= \tfrac{1}{7}\,|x_{n+1}^2 + x_{n+1}x_n + x_n^2|\,|x_{n+1} - x_n| \le \tfrac{3}{7}\,|x_{n+1} - x_n|.
\end{aligned} \]

Therefore, $(x_n)$ is a contractive sequence and hence there exists $r$ such that $\lim(x_n) = r$. If
we pass to the limit on both sides of the equality $x_{n+1} = (x_n^3 + 2)/7$, we obtain
$r = (r^3 + 2)/7$ and hence $r^3 - 7r + 2 = 0$. Thus $r$ is a solution of the equation.

We can approximate $r$ by choosing $x_1$ and calculating $x_2, x_3, \ldots$ successively. For
example, if we take $x_1 = 0.5$, we obtain (to nine decimal places):

\[ \begin{aligned}
x_2 &= 0.303\,571\,429, & x_3 &= 0.289\,710\,830, \\
x_4 &= 0.289\,188\,016, & x_5 &= 0.289\,169\,244, \\
x_6 &= 0.289\,168\,571, & &\text{etc.}
\end{aligned} \]
To estimate the accuracy, we note that $|x_2 - x_1| < 0.2$. Thus, after $n$ steps it follows from
Corollary 3.5.10(i) that we are sure that $|x^* - x_n| \le 3^{n-1}/(7^{n-2} \cdot 20)$. Thus, when $n = 6$, we
are sure that

\[ |x^* - x_6| \le 3^5/(7^4 \cdot 20) = 243/48\,020 < 0.0051. \]

Actually the approximation is substantially better than this. In fact, since $|x_6 - x_5| <
0.000\,0007$, it follows from 3.5.10(ii) that $|x^* - x_6| \le \tfrac{3}{4}|x_6 - x_5| < 0.000\,0006$. Hence the
first five decimal places of $x_6$ are correct.
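The iteration and both error estimates of Corollary 3.5.10 can be reproduced in a few lines. This is a minimal sketch under the example's own assumptions ($x_1 = 0.5$ and $C = 3/7$); the variable names `xs`, `bound_i`, `bound_ii`, and `residual` are illustrative and do not come from the text.

```python
C = 3 / 7                                  # contraction constant from the example
xs = [0.5]                                 # xs[0] holds x_1
for _ in range(5):
    xs.append((xs[-1] ** 3 + 2) / 7)       # x_{n+1} = (x_n**3 + 2) / 7

n = 6
x5, x6 = xs[4], xs[5]
bound_i = C ** (n - 1) / (1 - C) * abs(xs[1] - xs[0])   # estimate (i)
bound_ii = C / (1 - C) * abs(x6 - x5)                   # estimate (ii)
residual = x6 ** 3 - 7 * x6 + 2            # how nearly x_6 solves the cubic

print([round(v, 9) for v in xs])
print(bound_i, bound_ii, residual)
```

Estimate (ii) uses only the last two iterates, which is why it is so much sharper here than the a priori estimate (i).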


Exercises for Section 3.5 


1. Give an example of a bounded sequence that is not a Cauchy sequence.

2. Show directly from the definition that the following are Cauchy sequences.
(a) $\left( \dfrac{n+1}{n} \right)$, (b) $\left( 1 + \dfrac{1}{2!} + \cdots + \dfrac{1}{n!} \right)$.

3. Show directly from the definition that the following are not Cauchy sequences.
(a) $\left( (-1)^n \right)$, (b) $\left( n + \dfrac{(-1)^n}{n} \right)$, (c) $(\ln n)$.

4. Show directly from the definition that if $(x_n)$ and $(y_n)$ are Cauchy sequences, then $(x_n + y_n)$ and
$(x_n y_n)$ are Cauchy sequences.

5. If $x_n := \sqrt{n}$, show that $(x_n)$ satisfies $\lim|x_{n+1} - x_n| = 0$, but that it is not a Cauchy sequence.

6. Let $p$ be a given natural number. Give an example of a sequence $(x_n)$ that is not a Cauchy
sequence, but that satisfies $\lim|x_{n+p} - x_n| = 0$.

7. Let $(x_n)$ be a Cauchy sequence such that $x_n$ is an integer for every $n \in \mathbb{N}$. Show that $(x_n)$ is
ultimately constant.

8. Show directly that a bounded, monotone increasing sequence is a Cauchy sequence.

9. If $0 < r < 1$ and $|x_{n+1} - x_n| < r^n$ for all $n \in \mathbb{N}$, show that $(x_n)$ is a Cauchy sequence.

10. If $x_1 < x_2$ are arbitrary real numbers and $x_n := \tfrac{1}{2}(x_{n-2} + x_{n-1})$ for $n > 2$, show that $(x_n)$ is
convergent. What is its limit?

11. If $y_1 < y_2$ are arbitrary real numbers and $y_n := \tfrac{1}{3}y_{n-1} + \tfrac{2}{3}y_{n-2}$ for $n > 2$, show that $(y_n)$ is
convergent. What is its limit?

12. If $x_1 > 0$ and $x_{n+1} := (2 + x_n)^{-1}$ for $n \ge 1$, show that $(x_n)$ is a contractive sequence. Find the
limit.

13. If $x_1 := 2$ and $x_{n+1} := 2 + 1/x_n$ for $n \ge 1$, show that $(x_n)$ is a contractive sequence. What is its
limit?

14. The polynomial equation $x^3 - 5x + 1 = 0$ has a root $r$ with $0 < r < 1$. Use an appropriate
contractive sequence to calculate $r$ within $10^{-4}$.


Section 3.6 Properly Divergent Sequences 


For certain purposes it is convenient to define what is meant for a sequence $(x_n)$ of real
numbers to “tend to $\pm\infty$.”




3.6.1 Definition Let $(x_n)$ be a sequence of real numbers.

(i) We say that $(x_n)$ tends to $+\infty$, and write $\lim(x_n) = +\infty$, if for every $\alpha \in \mathbb{R}$ there
exists a natural number $K(\alpha)$ such that if $n \ge K(\alpha)$, then $x_n > \alpha$.

(ii) We say that $(x_n)$ tends to $-\infty$, and write $\lim(x_n) = -\infty$, if for every $\beta \in \mathbb{R}$ there
exists a natural number $K(\beta)$ such that if $n \ge K(\beta)$, then $x_n < \beta$.

We say that $(x_n)$ is properly divergent in case we have either $\lim(x_n) = +\infty$ or
$\lim(x_n) = -\infty$.

The reader should realize that we are using the symbols $+\infty$ and $-\infty$ purely as a
convenient notation in the above expressions. Results that have been proved in earlier
sections for conventional limits $\lim(x_n) = L$ (for $L \in \mathbb{R}$) may not remain true when
$\lim(x_n) = \pm\infty$.


3.6.2 Examples (a) $\lim(n) = +\infty$.

In fact, if $\alpha \in \mathbb{R}$ is given, let $K(\alpha)$ be any natural number such that $K(\alpha) > \alpha$.

(b) $\lim(n^2) = +\infty$.

If $K(\alpha)$ is a natural number such that $K(\alpha) > \alpha$, and if $n \ge K(\alpha)$ then we have
$n^2 \ge n > \alpha$.

(c) If $c > 1$, then $\lim(c^n) = +\infty$.

Let $c = 1 + b$, where $b > 0$. If $\alpha \in \mathbb{R}$ is given, let $K(\alpha)$ be a natural number such that
$K(\alpha) > \alpha/b$. If $n \ge K(\alpha)$ it follows from Bernoulli’s Inequality that

\[ c^n = (1 + b)^n \ge 1 + nb > 1 + \alpha > \alpha. \]

Therefore $\lim(c^n) = +\infty$.
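The choice of $K(\alpha)$ in part (c) is explicit enough to compute. The following sketch is an illustration, not part of the text; the helper name `K` and the sample values $c = 1.5$ and $\alpha = 10, 100, 500$ are assumptions. It picks a natural number exceeding $\alpha/b$ and confirms that $c^n > \alpha$ from that index on.

```python
import math

def K(alpha, c):
    """A natural number K with K > alpha / b, where c = 1 + b (illustrative helper)."""
    b = c - 1
    return max(1, math.floor(alpha / b) + 1)

c = 1.5
for alpha in (10, 100, 500):
    k = K(alpha, c)
    # Bernoulli's Inequality guarantees c**n >= 1 + n*b > alpha for every n >= k.
    print(alpha, k, all(c ** n > alpha for n in range(k, k + 5)))
```

The index produced this way is usually far larger than necessary; the definition only asks that some such index exist.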


Monotone sequences are particularly simple in regard to their convergence. We have 
seen in the Monotone Convergence Theorem 3.3.2 that a monotone sequence is convergent 
if and only if it is bounded. The next result is a reformulation of that result. 


3.6.3 Theorem A monotone sequence of real numbers is properly divergent if and only if
it is unbounded.

(a) If $(x_n)$ is an unbounded increasing sequence, then $\lim(x_n) = +\infty$.

(b) If $(x_n)$ is an unbounded decreasing sequence, then $\lim(x_n) = -\infty$.

Proof. (a) Suppose that $(x_n)$ is an increasing sequence. We know that if $(x_n)$ is bounded,
then it is convergent. If $(x_n)$ is unbounded, then for any $\alpha \in \mathbb{R}$ there exists $n(\alpha) \in \mathbb{N}$ such
that $\alpha < x_{n(\alpha)}$. But since $(x_n)$ is increasing, we have $\alpha < x_n$ for all $n \ge n(\alpha)$. Since $\alpha$ is
arbitrary, it follows that $\lim(x_n) = +\infty$.

Part (b) is proved in a similar fashion. Q.E.D.


The following “comparison theorem” is frequently used in showing that a sequence is 
properly divergent. [In fact, we implicitly used it in Example 3.6.2(c).] 


3.6.4 Theorem Let $(x_n)$ and $(y_n)$ be two sequences of real numbers and suppose that

\[ x_n \le y_n \quad \text{for all } n \in \mathbb{N}. \tag{1} \]

(a) If $\lim(x_n) = +\infty$, then $\lim(y_n) = +\infty$.
(b) If $\lim(y_n) = -\infty$, then $\lim(x_n) = -\infty$.

Proof. (a) If $\lim(x_n) = +\infty$, and if $\alpha \in \mathbb{R}$ is given, then there exists a natural number
$K(\alpha)$ such that if $n \ge K(\alpha)$, then $\alpha < x_n$. In view of (1), it follows that $\alpha < y_n$ for all
$n \ge K(\alpha)$. Since $\alpha$ is arbitrary, it follows that $\lim(y_n) = +\infty$.

The proof of (b) is similar. Q.E.D.

Remarks (a) Theorem 3.6.4 remains true if condition (1) is ultimately true; that is, if
there exists $m \in \mathbb{N}$ such that $x_n \le y_n$ for all $n \ge m$.
(b) If condition (1) of Theorem 3.6.4 holds and if $\lim(y_n) = +\infty$, it does not follow that
$\lim(x_n) = +\infty$. Similarly, if (1) holds and if $\lim(x_n) = -\infty$, it does not follow that
$\lim(y_n) = -\infty$. In using Theorem 3.6.4 to show that a sequence tends to $+\infty$ [respectively,
$-\infty$] we need to show that the terms of the sequence are ultimately greater [respectively,
less] than or equal to the corresponding terms of a sequence that is known to tend to $+\infty$
[respectively, $-\infty$].

Since it is sometimes difficult to establish an inequality such as (1), the following 
“limit comparison theorem” is often more convenient to use than Theorem 3.6.4. 


3.6.5 Theorem Let $(x_n)$ and $(y_n)$ be two sequences of positive real numbers and suppose
that for some $L \in \mathbb{R}$, $L > 0$, we have

\[ \lim(x_n/y_n) = L. \tag{2} \]

Then $\lim(x_n) = +\infty$ if and only if $\lim(y_n) = +\infty$.

Proof. If (2) holds, there exists $K \in \mathbb{N}$ such that

\[ \tfrac{1}{2}L < x_n/y_n < \tfrac{3}{2}L \quad \text{for all } n \ge K. \]

Hence we have $\left( \tfrac{1}{2}L \right) y_n < x_n < \left( \tfrac{3}{2}L \right) y_n$ for all $n \ge K$. The conclusion now follows from a
slight modification of Theorem 3.6.4. We leave the details to the reader. Q.E.D.

The reader can show that the conclusion need not hold if either $L = 0$ or $L = +\infty$.
However, there are some partial results that can be established in these cases, as will be
seen in the exercises.


Exercises for Section 3.6 


1. Show that if $(x_n)$ is an unbounded sequence, then there exists a properly divergent subsequence.

2. Give examples of properly divergent sequences $(x_n)$ and $(y_n)$ with $y_n \ne 0$ for all $n \in \mathbb{N}$ such that:
(a) $(x_n/y_n)$ is convergent, (b) $(x_n/y_n)$ is properly divergent.

3. Show that if $x_n > 0$ for all $n \in \mathbb{N}$, then $\lim(x_n) = 0$ if and only if $\lim(1/x_n) = +\infty$.

4. Establish the proper divergence of the following sequences.
(a) $(\sqrt{n})$, (b) $(\sqrt{n+1})$,
(c) $(\sqrt{n-1})$, (d) $(n/\sqrt{n+1})$.

5. Is the sequence $(n \sin n)$ properly divergent?

6. Let $(x_n)$ be properly divergent and let $(y_n)$ be such that $\lim(x_n y_n)$ belongs to $\mathbb{R}$. Show that $(y_n)$
converges to 0.

7. Let $(x_n)$ and $(y_n)$ be sequences of positive numbers such that $\lim(x_n/y_n) = 0$.
(a) Show that if $\lim(x_n) = +\infty$, then $\lim(y_n) = +\infty$.
(b) Show that if $(y_n)$ is bounded, then $\lim(x_n) = 0$.

8. Investigate the convergence or the divergence of the following sequences:
(a) $(\sqrt{n^2 + 2})$, (b) $(\sqrt{n}/(n^2 + 1))$,
(c) $(\sqrt{n^2 + 1}/\sqrt{n})$, (d) $(\sin \sqrt{n})$.

9. Let $(x_n)$ and $(y_n)$ be sequences of positive numbers such that $\lim(x_n/y_n) = +\infty$.
(a) Show that if $\lim(y_n) = +\infty$, then $\lim(x_n) = +\infty$.
(b) Show that if $(x_n)$ is bounded, then $\lim(y_n) = 0$.

10. Show that if $\lim(a_n/n) = L$, where $L > 0$, then $\lim(a_n) = +\infty$.


Section 3.7 Introduction to Infinite Series 


We will now give a brief introduction to infinite series of real numbers. This is a topic that 
will be discussed in more detail in Chapter 9, but because of its importance, we will 
establish a few results here. These results will be seen to be immediate consequences of 
theorems we have met in this chapter. 

In elementary texts, an infinite series is sometimes “defined” to be “an expression of
the form”

\[ x_1 + x_2 + \cdots + x_n + \cdots. \tag{1} \]


However, this “definition” lacks clarity, since there is a priori no particular value that we 
can attach to this array of symbols, which calls for an infinite number of additions to be 
performed. 


3.7.1 Definition If $X := (x_n)$ is a sequence in $\mathbb{R}$, then the infinite series (or simply the
series) generated by $X$ is the sequence $S := (s_k)$ defined by

\[ \begin{aligned}
s_1 &:= x_1 \\
s_2 &:= s_1 + x_2 \quad (= x_1 + x_2) \\
&\;\;\vdots \\
s_k &:= s_{k-1} + x_k \quad (= x_1 + x_2 + \cdots + x_k) \\
&\;\;\vdots
\end{aligned} \]

The numbers $x_n$ are called the terms of the series and the numbers $s_k$ are called the partial
sums of this series. If $\lim S$ exists, we say that this series is convergent and call this limit
the sum or the value of this series. If this limit does not exist, we say that the series $S$ is
divergent.
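The definition treats a series as the sequence of its partial sums, which translates directly into a running accumulation. The sketch below is illustrative only; the helper name `partial_sums` and the choice of terms $x_n = 1/2^n$ are assumptions, not taken from the text.

```python
from itertools import accumulate, count, islice

def partial_sums(terms):
    """Return an iterator over s_1, s_2, ..., where s_k = s_{k-1} + x_k."""
    return accumulate(terms)

# Illustrative terms (an assumption for this example): x_n = 1 / 2**n, n = 1, 2, ...
terms = (1 / 2 ** n for n in count(1))
first_five = list(islice(partial_sums(terms), 5))
print(first_five)
```

Whether the series converges is then a question about the sequence produced by `partial_sums`, not about the terms alone.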


It is convenient to use symbols such as

\[ \sum (x_n) \quad \text{or} \quad \sum x_n \quad \text{or} \quad \sum_{n=1}^{\infty} x_n \tag{2} \]

to denote both the infinite series $S$ generated by the sequence $X = (x_n)$ and also to denote
the value $\lim S$, in case this limit exists. Thus the symbols in (2) may be regarded merely
as a way of exhibiting an infinite series whose convergence or divergence is to be
investigated. In practice, this double use of these notations does not lead to any confusion,
provided it is understood that the convergence (or divergence) of the series must be
established.

Just as a sequence may be indexed such that its first element is not $x_1$, but is $x_0$, or $x_5$ or
$x_{99}$, we will denote the series having these numbers as their first element by the symbols

\[ \sum_{n=0}^{\infty} x_n \quad \text{or} \quad \sum_{n=5}^{\infty} x_n \quad \text{or} \quad \sum_{n=99}^{\infty} x_n. \]

It should be noted that when the first term in the series is $x_N$, then the first partial sum is
denoted by $s_N$.


Warning The reader should guard against confusing the words “sequence” and “series.” 
In nonmathematical language, these words are interchangeable; however, in mathematics, 
these words are not synonyms. Indeed, a series is a sequence $S = (s_k)$ obtained from a given
sequence $X = (x_n)$ according to the special procedure given in Definition 3.7.1.


3.7.2 Examples (a) Consider the sequence $X := (r^n)_{n=0}^{\infty}$ where $r \in \mathbb{R}$, which generates
the geometric series:

\[ \sum_{n=0}^{\infty} r^n = 1 + r + r^2 + \cdots + r^n + \cdots. \tag{3} \]

We will show that if $|r| < 1$, then this series converges to $1/(1 - r)$. (See also
Example 1.2.4(f).) Indeed, if $s_n := 1 + r + r^2 + \cdots + r^n$ for $n \ge 0$, and if we multiply $s_n$
by $r$ and subtract the result from $s_n$, we obtain (after some simplification):

\[ s_n(1 - r) = 1 - r^{n+1}. \]

Therefore, we have

\[ s_n - \frac{1}{1 - r} = -\frac{r^{n+1}}{1 - r}, \]

from which it follows that

\[ \left| s_n - \frac{1}{1 - r} \right| \le \frac{|r|^{n+1}}{|1 - r|}. \]

Since $|r|^{n+1} \to 0$ when $|r| < 1$, it follows that the geometric series converges to $1/(1 - r)$
when $|r| < 1$.
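The error estimate for the geometric series is easy to test numerically. This is a hedged sketch; the function name `geometric_partial_sum`, the ratio $r = -0.5$, and the sample indices are assumptions chosen for illustration. A small tolerance is added to the comparison because the computation uses floating-point arithmetic.

```python
def geometric_partial_sum(r, n):
    """s_n = 1 + r + r**2 + ... + r**n (illustrative helper)."""
    return sum(r ** k for k in range(n + 1))

r = -0.5
target = 1 / (1 - r)                       # claimed value of the series
for n in (5, 10, 20):
    s = geometric_partial_sum(r, n)
    bound = abs(r) ** (n + 1) / abs(1 - r) # |s_n - 1/(1-r)| <= |r|**(n+1) / |1-r|
    print(n, s, abs(s - target), bound, abs(s - target) <= bound + 1e-12)
```

In exact arithmetic the error equals the bound, so the estimate cannot be improved.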
(b) Consider the series generated by $\left( (-1)^n \right)_{n=0}^{\infty}$; that is, the series:

\[ \sum_{n=0}^{\infty} (-1)^n = (+1) + (-1) + (+1) + (-1) + \cdots. \tag{4} \]

It is easily seen (by Mathematical Induction) that $s_n = 1$ if $n \ge 0$ is even and $s_n = 0$ if
$n$ is odd; therefore, the sequence of partial sums is $(1, 0, 1, 0, \ldots)$. Since this sequence is
not convergent, the series (4) is divergent.


(c) Consider the series

\[ \sum_{n=1}^{\infty} \frac{1}{n(n+1)} = \frac{1}{1 \cdot 2} + \frac{1}{2 \cdot 3} + \frac{1}{3 \cdot 4} + \cdots. \tag{5} \]

By a stroke of insight, we note that

\[ \frac{1}{k(k+1)} = \frac{1}{k} - \frac{1}{k+1}. \]

Hence, on adding these terms from $k = 1$ to $k = n$ and noting the telescoping that takes
place, we obtain

\[ s_n = \frac{1}{1} - \frac{1}{n+1}, \]

whence it follows that $s_n \to 1$. Therefore the series (5) converges to 1.
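The telescoping identity can be confirmed with exact fractions. The following is a small sketch, not part of the text; the helper name `s` and the sample values of n are assumptions.

```python
from fractions import Fraction

def s(n):
    """Exact partial sum of 1/(k(k+1)) for k = 1, ..., n (illustrative helper)."""
    return sum(Fraction(1, k * (k + 1)) for k in range(1, n + 1))

for n in (1, 2, 10, 100):
    # Telescoping predicts s_n = 1 - 1/(n+1).
    print(n, s(n), s(n) == 1 - Fraction(1, n + 1))
```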


We now present a very useful and simple necessary condition for the convergence of a 
series. It is far from being sufficient, however. 


3.7.3 The nth Term Test If the series $\sum x_n$ converges, then $\lim(x_n) = 0$.

Proof. By Definition 3.7.1, the convergence of $\sum x_n$ requires that $\lim(s_k)$ exists. Since
$x_n = s_n - s_{n-1}$, then $\lim(x_n) = \lim(s_n) - \lim(s_{n-1}) = 0$. Q.E.D.


Since the following Cauchy Criterion is precisely a reformulation of Theorem 3.5.5, 
we will omit its proof. 


3.7.4 Cauchy Criterion for Series The series $\sum x_n$ converges if and only if for every
$\varepsilon > 0$ there exists $M(\varepsilon) \in \mathbb{N}$ such that if $m > n \ge M(\varepsilon)$, then

\[ |s_m - s_n| = |x_{n+1} + x_{n+2} + \cdots + x_m| < \varepsilon. \tag{6} \]

The next result, although limited in scope, is of great importance and utility.


3.7.5 Theorem Let $(x_n)$ be a sequence of nonnegative real numbers. Then the series $\sum x_n$
converges if and only if the sequence $S = (s_k)$ of partial sums is bounded. In this case,

\[ \sum_{n=1}^{\infty} x_n = \lim(s_k) = \sup\{s_k : k \in \mathbb{N}\}. \]

Proof. Since $x_n \ge 0$, the sequence $S$ of partial sums is monotone increasing:

\[ s_1 \le s_2 \le \cdots \le s_k \le \cdots. \]

By the Monotone Convergence Theorem 3.3.2, the sequence $S = (s_k)$ converges if and
only if it is bounded, in which case its limit equals $\sup\{s_k\}$. Q.E.D.


3.7.6 Examples (a) The geometric series (3) diverges if $|r| \ge 1$.
This follows from the fact that the terms $r^n$ do not approach 0 when $|r| \ge 1$.

(b) The harmonic series $\sum_{n=1}^{\infty} \frac{1}{n}$ diverges.
Since the terms $1/n \to 0$, we cannot use the nth Term Test 3.7.3 to establish this
divergence. However, it was seen in Examples 3.3.3(b) and 3.5.6(c) that the sequence $(s_n)$
of partial sums is not bounded. Therefore, it follows from Theorem 3.7.5 that the harmonic
series is divergent. This series is famous for the very slow growth of its partial sums (see the
discussion in Example 3.3.3(b)) and also for the variety of proofs of its divergence. Here is
a proof by contradiction. If we assume the series converges to the number $S$, then we have

\[ \begin{aligned}
S &= \left( 1 + \tfrac{1}{2} \right) + \left( \tfrac{1}{3} + \tfrac{1}{4} \right) + \left( \tfrac{1}{5} + \tfrac{1}{6} \right) + \cdots + \left( \tfrac{1}{2n-1} + \tfrac{1}{2n} \right) + \cdots \\
&> \left( \tfrac{1}{2} + \tfrac{1}{2} \right) + \left( \tfrac{1}{4} + \tfrac{1}{4} \right) + \left( \tfrac{1}{6} + \tfrac{1}{6} \right) + \cdots + \left( \tfrac{1}{2n} + \tfrac{1}{2n} \right) + \cdots \\
&= 1 + \tfrac{1}{2} + \tfrac{1}{3} + \cdots + \tfrac{1}{n} + \cdots \\
&= S.
\end{aligned} \]

The contradiction $S > S$ shows the assumption of convergence must be false and the
harmonic series must diverge.


Note The harmonic series receives its musical name from the fact that the wavelengths of 
the overtones of a vibrating string are 1/2, 1/3, 1/4,..., of the string’s fundamental 
wavelength. 


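(c) The 2-series $\sum_{n=1}^{\infty} \frac{1}{n^2}$ is convergent.

Since the partial sums are monotone, it suffices (why?) to show that some subsequence
of $(s_k)$ is bounded. If $k_1 := 2^1 - 1 = 1$, then $s_{k_1} = 1$. If $k_2 := 2^2 - 1 = 3$, then

\[ s_{k_2} = \frac{1}{1} + \left( \frac{1}{2^2} + \frac{1}{3^2} \right) < 1 + \frac{2}{2^2} = 1 + \frac{1}{2}, \]

and if $k_3 := 2^3 - 1 = 7$, then we have

\[ s_{k_3} = s_{k_2} + \left( \frac{1}{4^2} + \frac{1}{5^2} + \frac{1}{6^2} + \frac{1}{7^2} \right) < s_{k_2} + \frac{4}{4^2} < 1 + \frac{1}{2} + \frac{1}{2^2}. \]

By Mathematical Induction, we find that if $k_j := 2^j - 1$, then

\[ 0 < s_{k_j} \le 1 + \tfrac{1}{2} + \left( \tfrac{1}{2} \right)^2 + \cdots + \left( \tfrac{1}{2} \right)^{j-1}. \]

Since the term on the right is a partial sum of a geometric series with $r = \tfrac{1}{2}$, it is dominated
by $1/\left( 1 - \tfrac{1}{2} \right) = 2$, and Theorem 3.7.5 implies that the 2-series converges.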
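The grouping argument for the 2-series can be checked for the first several blocks. This sketch is illustrative and not from the text; the helper name `s` and the range of j are assumptions. It compares the partial sum at $k_j = 2^j - 1$ with the geometric partial sum that dominates it.

```python
from fractions import Fraction

def s(k):
    """Exact partial sum 1/1**2 + 1/2**2 + ... + 1/k**2 (illustrative helper)."""
    return sum(Fraction(1, n * n) for n in range(1, k + 1))

for j in range(1, 9):
    k_j = 2 ** j - 1
    geometric = sum(Fraction(1, 2 ** i) for i in range(j))   # 1 + 1/2 + ... + (1/2)**(j-1)
    print(j, k_j, float(s(k_j)), float(geometric), s(k_j) <= geometric < 2)
```

The printed partial sums stay well below 2, which bounds the sum of the series without identifying it.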


(d) The p-series $\sum_{n=1}^{\infty} \frac{1}{n^p}$ converges when $p > 1$.

Since the argument is very similar to the special case considered in part (c), we will
leave some of the details to the reader. As before, if $k_1 := 2^1 - 1 = 1$, then $s_{k_1} = 1$. If
$k_2 := 2^2 - 1 = 3$, then since $2^p < 3^p$, we have

\[ s_{k_2} = \frac{1}{1^p} + \left( \frac{1}{2^p} + \frac{1}{3^p} \right) < 1 + \frac{2}{2^p} = 1 + \frac{1}{2^{p-1}}. \]

Further, if $k_3 := 2^3 - 1$, then (how?) it is seen that

\[ s_{k_3} < s_{k_2} + \frac{4}{4^p} < 1 + \frac{1}{2^{p-1}} + \frac{1}{4^{p-1}}. \]

Finally, we let $r := 1/2^{p-1}$; since $p > 1$, we have $0 < r < 1$. Using Mathematical
Induction, we show that if $k_j := 2^j - 1$, then

\[ 0 < s_{k_j} \le 1 + r + r^2 + \cdots + r^{j-1} < \frac{1}{1 - r}. \]

Therefore, Theorem 3.7.5 implies that the p-series converges when $p > 1$.
(e) The p-series $\sum_{n=1}^{\infty} \frac{1}{n^p}$ diverges when $0 < p \le 1$.
We will use the elementary inequality $n^p \le n$ when $n \in \mathbb{N}$ and $0 < p \le 1$. It follows that

\[ \frac{1}{n} \le \frac{1}{n^p} \quad \text{for } n \in \mathbb{N}. \]

Since the partial sums of the harmonic series are not bounded, this inequality shows that the
partial sums of the p-series are not bounded when $0 < p \le 1$. Hence the p-series diverges
for these values of $p$.


(f) The alternating harmonic series, given by

\[ \sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n} = \frac{1}{1} - \frac{1}{2} + \frac{1}{3} - \cdots + \frac{(-1)^{n+1}}{n} + \cdots \tag{7} \]

is convergent.

The reader should compare this series with the harmonic series in (b), which is
divergent. Thus, the subtraction of some of the terms in (7) is essential if this series is to
converge. Since we have

\[ s_{2n} = \left( \frac{1}{1} - \frac{1}{2} \right) + \left( \frac{1}{3} - \frac{1}{4} \right) + \cdots + \left( \frac{1}{2n-1} - \frac{1}{2n} \right), \]

it is clear that the “even” subsequence $(s_{2n})$ is increasing. Similarly, the “odd” subsequence
$(s_{2n+1})$ is decreasing since

\[ s_{2n+1} = \frac{1}{1} - \left( \frac{1}{2} - \frac{1}{3} \right) - \left( \frac{1}{4} - \frac{1}{5} \right) - \cdots - \left( \frac{1}{2n} - \frac{1}{2n+1} \right). \]

Since $0 < s_{2n} < s_{2n} + 1/(2n + 1) = s_{2n+1} \le 1$, both of these subsequences are bounded
below by 0 and above by 1. Therefore they are both convergent, and to the same value because
$s_{2n+1} - s_{2n} = 1/(2n + 1) \to 0$. Thus the sequence $(s_n)$ of partial sums converges, proving that
the alternating harmonic series (7) converges. (It is far from obvious that the limit of this series
is equal to $\ln 2$.)
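The behavior of the even and odd partial sums can be watched numerically. The sketch below is an illustration, not part of the text; the helper name `s` and the sample values of n are assumptions. It prints $s_{2n}$ and $s_{2n+1}$ next to $\ln 2$, the limit mentioned in the text.

```python
import math

def s(n):
    """Partial sum 1/1 - 1/2 + 1/3 - ... + (-1)**(n+1)/n (illustrative helper)."""
    return sum((-1) ** (k + 1) / k for k in range(1, n + 1))

for n in (1, 5, 50, 500):
    even, odd = s(2 * n), s(2 * n + 1)
    # The even sums increase, the odd sums decrease, and they differ by 1/(2n+1).
    print(n, even, odd, math.log(2))
```

The two subsequences squeeze the limit from below and above, which also gives a simple error estimate: the limit lies between any even partial sum and the following odd one.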


Comparison Tests 


Our first test shows that if the terms of a nonnegative series are dominated by the 
corresponding terms of a convergent series, then the first series is convergent. 


3.7.7 Comparison Test Let $X := (x_n)$ and $Y := (y_n)$ be real sequences and suppose that
for some $K \in \mathbb{N}$ we have

\[ 0 \le x_n \le y_n \quad \text{for } n \ge K. \tag{8} \]

(a) Then the convergence of $\sum y_n$ implies the convergence of $\sum x_n$.
(b) The divergence of $\sum x_n$ implies the divergence of $\sum y_n$.

Proof. (a) Suppose that $\sum y_n$ converges and, given $\varepsilon > 0$, let $M(\varepsilon) \in \mathbb{N}$ be such that if
$m > n \ge M(\varepsilon)$, then

\[ y_{n+1} + \cdots + y_m < \varepsilon. \]

If $m > n > \sup\{K, M(\varepsilon)\}$, then it follows that

\[ 0 \le x_{n+1} + \cdots + x_m \le y_{n+1} + \cdots + y_m < \varepsilon, \]

from which the convergence of $\sum x_n$ follows.
(b) This statement is the contrapositive of (a). Q.E.D.


Since it is sometimes difficult to establish the inequalities (8), the next result is 
frequently very useful. 


3.7.8 Limit Comparison Test Suppose that $X := (x_n)$ and $Y := (y_n)$ are strictly
positive sequences and suppose that the following limit exists in $\mathbb{R}$:

\[ r := \lim\left( \frac{x_n}{y_n} \right). \tag{9} \]

(a) If $r \ne 0$ then $\sum x_n$ is convergent if and only if $\sum y_n$ is convergent.
(b) If $r = 0$ and if $\sum y_n$ is convergent, then $\sum x_n$ is convergent.

Proof. (a) It follows from (9) and Exercise 3.1.18 that there exists $K \in \mathbb{N}$ such that
$\tfrac{1}{2}r \le x_n/y_n \le 2r$ for $n \ge K$, whence

\[ \left( \tfrac{1}{2}r \right) y_n \le x_n \le (2r)\,y_n \quad \text{for } n \ge K. \]

If we apply the Comparison Test 3.7.7 twice, we obtain the assertion in (a).
(b) If $r = 0$, then there exists $K \in \mathbb{N}$ such that

\[ 0 < x_n \le y_n \quad \text{for } n \ge K, \]

so that Theorem 3.7.7(a) applies. Q.E.D.

Remark The Comparison Tests 3.7.7 and 3.7.8 depend on having a stock of series that
one knows to be convergent (or divergent). The reader will find that the p-series is often
useful for this purpose.


3.7.9 Examples (a) The series $\sum_{n=1}^{\infty} \frac{1}{n^2 + n}$ converges.

It is clear that the inequality

\[ 0 < \frac{1}{n^2 + n} < \frac{1}{n^2} \quad \text{for } n \in \mathbb{N} \]

is valid. Since the series $\sum 1/n^2$ is convergent (by Example 3.7.6(c)), we can apply the
Comparison Test 3.7.7 to obtain the convergence of the given series.


(b) The series $\sum_{n=1}^{\infty} \frac{1}{n^2 - n + 1}$ is convergent.

If the inequality

\[ \frac{1}{n^2 - n + 1} \le \frac{1}{n^2} \tag{10} \]

were true, we could argue as in (a). However, (10) is false for all $n \ge 2$. The reader can
probably show that the inequality

\[ 0 < \frac{1}{n^2 - n + 1} \le \frac{2}{n^2} \]

is valid for all $n \in \mathbb{N}$, and this inequality will work just as well. However, it might take
some experimentation to think of such an inequality and then establish it.
Instead, if we take $x_n := 1/(n^2 - n + 1)$ and $y_n := 1/n^2$, then we have

\[ \frac{x_n}{y_n} = \frac{n^2}{n^2 - n + 1} = \frac{1}{1 - (1/n) + (1/n^2)} \to 1. \]

Therefore, the convergence of the given series follows from the Limit Comparison Test
3.7.8(a).
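Both routes in this example, the direct bound by $2/n^2$ and the limiting ratio, can be inspected numerically. The following sketch is illustrative; the function names `x` and `y` and the sample indices are assumptions made for the example.

```python
def x(n):
    """Term of the given series, 1/(n**2 - n + 1)."""
    return 1 / (n * n - n + 1)

def y(n):
    """Term of the comparison series, 1/n**2."""
    return 1 / (n * n)

for n in (2, 10, 1000, 10 ** 6):
    ratio = x(n) / y(n)                    # should approach 1 as n grows
    print(n, ratio, x(n) <= 2 * y(n))
```

The ratio tends to 1, so the Limit Comparison Test applies without having to guess a suitable constant such as 2 in advance.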


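(c) The series $\sum_{n=1}^{\infty} \frac{1}{\sqrt{n + 1}}$ is divergent.

This series closely resembles the series $\sum 1/\sqrt{n}$, which is a p-series with $p = \tfrac{1}{2}$; by
Example 3.7.6(e), it is divergent. If we let $x_n := 1/\sqrt{n + 1}$ and $y_n := 1/\sqrt{n}$, then we have

\[ \frac{x_n}{y_n} = \frac{\sqrt{n}}{\sqrt{n + 1}} = \frac{1}{\sqrt{1 + 1/n}} \to 1. \]

Therefore the Limit Comparison Test 3.7.8(a) applies.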


(d) The series $\sum_{n=1}^{\infty} \frac{1}{n!}$ is convergent.

It would be possible to establish this convergence by showing (by Induction) that
$n^2 < n!$ for $n \ge 4$, whence it follows that

\[ 0 < \frac{1}{n!} < \frac{1}{n^2} \quad \text{for } n \ge 4. \]

Alternatively, if we let $x_n := 1/n!$ and $y_n := 1/n^2$, then (when $n > 4$) we have

\[ 0 \le \frac{x_n}{y_n} = \frac{n^2}{n!} = \frac{n}{1 \cdot 2 \cdots (n - 1)} < \frac{1}{n - 2} \to 0. \]

Therefore the Limit Comparison Test 3.7.8(b) applies. (Note that this test was a bit
troublesome to apply since we do not presently know the convergence of any series for
which the limit of $x_n/y_n$ is really easy to determine.)


Exercises for Section 3.7 


1. Let $\sum a_n$ be a given series and let $\sum b_n$ be the series in which the terms are the same and in the
same order as in $\sum a_n$ except that the terms for which $a_n = 0$ have been omitted. Show that
$\sum a_n$ converges to $A$ if and only if $\sum b_n$ converges to $A$.


2. Show that the convergence of a series is not affected by changing a finite number of its terms. (Of 
course, the value of the sum may be changed.) 


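3. By using partial fractions, show that
(a) $\sum_{n=0}^{\infty} \frac{1}{(n+1)(n+2)} = 1$, (b) $\sum_{n=0}^{\infty} \frac{1}{(\alpha + n)(\alpha + n + 1)} = \frac{1}{\alpha} > 0$, if $\alpha > 0$,
(c) $\sum_{n=1}^{\infty} \frac{1}{n(n+1)(n+2)} = \frac{1}{4}$.

4. If $\sum x_n$ and $\sum y_n$ are convergent, show that $\sum (x_n + y_n)$ is convergent.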


5. Can you give an example of a convergent series $\sum x_n$ and a divergent series $\sum y_n$ such that
$\sum (x_n + y_n)$ is convergent? Explain.

6. (a) Calculate the value of $\sum_{n=2}^{\infty} (2/7)^n$. (Note the series starts at $n = 2$.)
(b) Calculate the value of $\sum_{n=1}^{\infty} (1/3)^{2n}$. (Note the series starts at $n = 1$.)

7. Find a formula for the series $\sum_{n=1}^{\infty} r^{2n}$ when $|r| < 1$.

8. Let $r_1, r_2, \ldots, r_n, \ldots$ be an enumeration of the rational numbers in the interval $[0, 1]$.
(See Section 1.3.) For a given $\varepsilon > 0$, put an interval of length $\varepsilon^n$ about the nth rational number
$r_n$ for $n = 1, 2, 3, \ldots$, and find the total sum of the lengths of all the intervals. Evaluate this
number for $\varepsilon = 0.1$ and $\varepsilon = 0.01$.

9. (a) Show that the series $\sum_{n=1}^{\infty} \cos n$ is divergent.
(b) Show that the series $\sum_{n=1}^{\infty} (\cos n)/n^2$ is convergent.
10. Use an argument similar to that in Example 3.7.6(f) to show that the series $\sum_{n=1}^{\infty} \frac{(-1)^n}{\sqrt{n}}$ is
convergent.

11. If $\sum a_n$ with $a_n > 0$ is convergent, then is $\sum a_n^2$ always convergent? Either prove it or give a
counterexample.

12. If $\sum a_n$ with $a_n > 0$ is convergent, then is $\sum \sqrt{a_n}$ always convergent? Either prove it or give a
counterexample.

13. If $\sum a_n$ with $a_n > 0$ is convergent, then is $\sum \sqrt{a_n a_{n+1}}$ always convergent? Either prove it or give
a counterexample.

14. If $\sum a_n$ with $a_n > 0$ is convergent, and if $b_n := (a_1 + \cdots + a_n)/n$ for $n \in \mathbb{N}$, then show that
$\sum b_n$ is always divergent.
15. Let $\sum_{n=1}^{\infty} a(n)$ be such that $(a(n))$ is a decreasing sequence of strictly positive numbers. If $s(n)$
denotes the nth partial sum, show (by grouping the terms in $s(2^n)$ in two different ways) that

\[ \tfrac{1}{2}\left( a(1) + 2a(2) + \cdots + 2^n a(2^n) \right) \le s(2^n) \le \left( a(1) + 2a(2) + \cdots + 2^{n-1} a(2^{n-1}) \right) + a(2^n). \]

Use these inequalities to show that $\sum_{n=1}^{\infty} a(n)$ converges if and only if $\sum_{n=1}^{\infty} 2^n a(2^n)$ converges. This
result is often called the Cauchy Condensation Test; it is very powerful.

16. Use the Cauchy Condensation Test to discuss the p-series $\sum_{n=1}^{\infty} (1/n^p)$ for $p > 0$.
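17. Use the Cauchy Condensation Test to establish the divergence of the series:
(a) $\sum \frac{1}{n \ln n}$, (b) $\sum \frac{1}{n (\ln n)(\ln \ln n)}$,
(c) $\sum \frac{1}{n (\ln n)(\ln \ln n)(\ln \ln \ln n)}$.

18. Show that if $c > 1$, then the following series are convergent:
(a) $\sum \frac{1}{n (\ln n)^c}$, (b) $\sum \frac{1}{n (\ln n)(\ln \ln n)^c}$.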


CHAPTER 4 


LIMITS 


“Mathematical analysis” is generally understood to refer to that area of mathematics in 
which systematic use is made of various limiting concepts. In the preceding chapter we 
studied one of these basic limiting concepts: the limit of a sequence of real numbers. In this 
chapter we will encounter the notion of the limit of a function. 

The rudimentary notion of a limiting process emerged in the 1680s as Isaac Newton 
(1642-1727) and Gottfried Leibniz (1646-1716) struggled with the creation of the 
Calculus. Though each person’s work was initially unknown to the other and their creative 
insights were quite different, both realized the need to formulate a notion of function and 
the idea of quantities being “close to” one another. Newton used the word “fluent” to
denote a relationship between variables, and in his major work Principia in 1687 he
discussed limits “to which they approach nearer than by any given difference, but never go
beyond, nor in effect attain to, till the quantities are diminished in infinitum.” Leibniz
introduced the term “function” to indicate a quantity that depended on a variable, and he
invented “infinitesimally small” numbers as a way of handling the concept of a limit. The
term “function” soon became standard terminology, and Leibniz also introduced the term 
“calculus” for this new method of calculation. 

In 1748, Leonhard Euler (1707-1783) published his two-volume treatise Introductio
in Analysin Infinitorum, in which he discussed power series, the exponential and
logarithmic functions, the trigonometric functions, and many related topics. This was
followed by Institutiones Calculi Differentialis in 1755 and the three-volume Institutiones
Calculi Integralis in 1768-1770. These works remained the standard textbooks on
calculus for many years. But the concept of limit was very intuitive and its looseness 
led to a number of problems. Verbal descriptions of the limit concept were proposed 
by other mathematicians of the era, but none was adequate to provide the basis for 
rigorous proofs. 

In 1821, Augustin-Louis Cauchy (1789-1857) published his lectures on analysis in his 
Cours d’Analyse, which set the standard for mathematical exposition for many years. He 
was concerned with rigor and in many ways raised the level of precision in mathematical 
discourse. He formulated definitions and presented arguments with greater care than his 
predecessors, but the concept of limit still remained elusive. In an early chapter he gave the 
following definition: 


If the successive values attributed to the same variable approach indefinitely a 
fixed value, such that they finally differ from it by as little as one wishes, this latter 
is called the limit of all the others. 


The final steps in formulating a precise definition of limit were taken by Karl
Weierstrass (1815-1897). He insisted on precise language and rigorous proofs, and his
definition of limit is the one we use today.
Gottfried Leibniz
Gottfried Wilhelm Leibniz (1646-1716) was born in Leipzig, Germany. 
He was six years old when his father, a professor of philosophy, died and 
left his son the key to his library and a life of books and learning. Leibniz 
entered the University of Leipzig at age 15, graduated at age 17, and 
received a Doctor of Law degree from the University of Altdorf four years 
later. He wrote on legal matters, but was more interested in philosophy. He 
also developed original theories about language and the nature of the 
universe. In 1672, he went to Paris as a diplomat for four years. While 
there he began to study mathematics with the Dutch mathematician Christiaan Huygens. His 
travels to London to visit the Royal Society further stimulated his interest in mathematics. His
background in philosophy led him to very original, though not always rigorous, results. 
Unaware of Newton’s unpublished work, Leibniz published papers in the 1680s that
presented a method of finding areas that is known today as the Fundamental Theorem of Calculus. 
He coined the term “calculus” and invented the dy/dx and elongated S notations that are used 
today. Unfortunately, some followers of Newton accused Leibniz of plagiarism, resulting in a 
dispute that lasted until Leibniz’s death. Their approaches to calculus were quite different and it is 
now evident that their discoveries were made independently. Leibniz is now renowned for his 
work in philosophy, but his mathematical fame rests on his creation of the calculus. 


Section 4.1 Limits of Functions 


In this section we will introduce the important notion of the limit of a function. The 
intuitive idea of the function f having a limit L at the point c is that the values f(x) are close 
to L when x is close to (but different from) c. But it is necessary to have a technical way of 
working with the idea of “close to” and this is accomplished in the ε-δ definition given
below. 

In order for the idea of the limit of a function f at a point c to be meaningful, it is 
necessary that f be defined at points near c. It need not be defined at the point c, but it should 
be defined at enough points close to c to make the study interesting. This is the reason for 
the following definition. 


4.1.1 Definition Let A ⊆ ℝ. A point c ∈ ℝ is a cluster point of A if for every δ > 0 there
exists at least one point x ∈ A, x ≠ c such that |x − c| < δ.


This definition is rephrased in the language of neighborhoods as follows: A point c is a
cluster point of the set A if every δ-neighborhood V_δ(c) = (c − δ, c + δ) of c contains at
least one point of A distinct from c.


Note The point c may or may not be a member of A, but even if it is in A, it is ignored 
when deciding whether it is a cluster point of A or not, since we explicitly require that there 
be points in V_δ(c) ∩ A distinct from c in order for c to be a cluster point of A.


For example, if A := {1, 2}, then the point 1 is not a cluster point of A, since choosing 
δ := 1/2 gives a neighborhood of 1 that contains no points of A distinct from 1. The same is
true for the point 2, so we see that A has no cluster points. 


4.1.2 Theorem A number c ∈ ℝ is a cluster point of a subset A of ℝ if and only if there
exists a sequence (a_n) in A such that lim(a_n) = c and a_n ≠ c for all n ∈ ℕ.


Proof. If c is a cluster point of A, then for any n ∈ ℕ the (1/n)-neighborhood V_{1/n}(c)
contains at least one point a_n in A distinct from c. Then a_n ∈ A, a_n ≠ c, and |a_n − c| < 1/n
implies lim(a_n) = c.

Conversely, if there exists a sequence (a_n) in A\{c} with lim(a_n) = c, then
for any δ > 0 there exists K such that if n ≥ K, then a_n ∈ V_δ(c). Therefore the
δ-neighborhood V_δ(c) of c contains the points a_n, for n ≥ K, which belong to A and
are distinct from c. Q.E.D.


The next examples emphasize that a cluster point of a set may or may not belong to 
the set. 


4.1.3 Examples (a) For the open interval A_1 := (0, 1), every point of the closed
interval [0, 1] is a cluster point of A_1. Note that the points 0, 1 are cluster points of A_1,
but do not belong to A_1. All the points of A_1 are cluster points of A_1.

(b) A finite set has no cluster points. 

(c) The infinite set N has no cluster points. 

(d) The set A_4 := {1/n : n ∈ ℕ} has only the point 0 as a cluster point. None of the points
in A_4 is a cluster point of A_4.

(e) If I := [0, 1], then the set A_5 := I ∩ ℚ consists of all the rational numbers in I. It follows
from the Density Theorem 2.4.8 that every point in I is a cluster point of A_5.
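
The claims in Example 4.1.3(d) can be made concrete with exact rational arithmetic. The following is only an illustrative sketch: the helper names `witness_near_zero` and `isolating_radius` are hypothetical and do not come from the text. The first function produces, for a given δ > 0, a point of the set {1/n : n ∈ ℕ} in the δ-neighborhood of 0 that is distinct from 0; the second returns a radius δ for which the δ-neighborhood of 1/m contains no other point of the set.

```python
from fractions import Fraction


def witness_near_zero(delta: Fraction) -> Fraction:
    """Return a point 1/n of {1/n : n in N} with 0 < 1/n < delta (assumes delta > 0)."""
    n = int(1 / delta) + 1  # n > 1/delta, hence 1/n < delta
    return Fraction(1, n)


def isolating_radius(m: int) -> Fraction:
    """Return delta > 0 such that V_delta(1/m) contains no point 1/n with n != m.

    The nearest other point of the set is 1/(m + 1), at distance 1/(m(m + 1)).
    """
    return Fraction(1, m) - Fraction(1, m + 1)


if __name__ == "__main__":
    for delta in (Fraction(1, 3), Fraction(7, 1000)):
        print(delta, witness_near_zero(delta))
    for m in (1, 2, 5):
        print(m, isolating_radius(m))
```

The first function shows why 0 is a cluster point; the second shows why no point 1/m of the set is one.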


Having made this brief detour, we now return to the concept of the limit of a function at 
a cluster point of its domain. 


The Definition of the Limit 


We now state the precise definition of the limit of a function f at a point c. It is important to
note that in this definition, it is immaterial whether f is defined at c or not. In any case, we
exclude c from consideration in the determination of the limit. 


4.1.4 Definition Let A ⊆ ℝ, and let c be a cluster point of A. For a function f : A → ℝ, a
real number L is said to be a limit of f at c if, given any ε > 0, there exists a δ > 0 such that
if x ∈ A and 0 < |x − c| < δ, then |f(x) − L| < ε.


Remarks (a) Since the value of δ usually depends on ε, we will sometimes write δ(ε)
instead of δ to emphasize this dependence.

(b) The inequality 0 < |x − c| is equivalent to saying x ≠ c.

If L is a limit of f at c, then we also say that f converges to L at c. We often write

    L = lim_{x→c} f(x)   or   L = lim_{x→c} f.

We also say that “f(x) approaches L as x approaches c.” (But it should be noted that the
points do not actually move anywhere.) The symbolism

    f(x) → L   as   x → c

is also used sometimes to express the fact that f has limit L at c.

If the limit of f at c does not exist, we say that f diverges at c.

Our first result is that the value L of the limit is uniquely determined. This uniqueness
is not part of the definition of limit, but must be deduced.


4.1.5 Theorem If f : A → ℝ and if c is a cluster point of A, then f can have only one
limit at c.


Proof. Suppose that numbers L and L′ satisfy Definition 4.1.4. For any ε > 0, there exists
δ(ε/2) > 0 such that if x ∈ A and 0 < |x − c| < δ(ε/2), then |f(x) − L| < ε/2. Also there
exists δ′(ε/2) such that if x ∈ A and 0 < |x − c| < δ′(ε/2), then |f(x) − L′| < ε/2. Now
let δ := inf{δ(ε/2), δ′(ε/2)}. Then if x ∈ A and 0 < |x − c| < δ, the Triangle Inequality
implies that

    |L − L′| ≤ |L − f(x)| + |f(x) − L′| < ε/2 + ε/2 = ε.

Since ε > 0 is arbitrary, we conclude that L − L′ = 0, so that L = L′. Q.E.D.

The definition of limit can be very nicely described in terms of neighborhoods. (See
Figure 4.1.1.) We observe that because

    V_δ(c) = (c − δ, c + δ) = {x : |x − c| < δ},

the inequality 0 < |x − c| < δ is equivalent to saying that x ≠ c and x belongs to the
δ-neighborhood V_δ(c) of c. Similarly, the inequality |f(x) − L| < ε is equivalent to saying
that f(x) belongs to the ε-neighborhood V_ε(L) of L. In this way, we obtain the following
result. The reader should write out a detailed argument to establish the theorem.


Figure 4.1.1 The limit of f at c is L. (Given V_ε(L), there exists V_δ(c).)


4.1.6 Theorem Let f : A → ℝ and let c be a cluster point of A. Then the following
statements are equivalent.

(i) lim_{x→c} f(x) = L.
(ii) Given any ε-neighborhood V_ε(L) of L, there exists a δ-neighborhood V_δ(c) of c such
that if x ≠ c is any point in V_δ(c) ∩ A, then f(x) belongs to V_ε(L).


We now give some examples that illustrate how the definition of limit is applied. 


4.1.7 Examples (a) lim_{x→c} b = b.

To be more explicit, let f(x) := b for all x ∈ ℝ. We want to show that lim_{x→c} f(x) = b. If
ε > 0 is given, we let δ := 1. (In fact, any strictly positive δ will serve the purpose.) Then if
0 < |x − c| < 1, we have |f(x) − b| = |b − b| = 0 < ε. Since ε > 0 is arbitrary, we
conclude from Definition 4.1.4 that lim_{x→c} f(x) = b.

(b) lim_{x→c} x = c.

Let g(x) := x for all x ∈ ℝ. If ε > 0, we choose δ(ε) := ε. Then if 0 < |x − c| < δ(ε),
we have |g(x) − c| = |x − c| < ε. Since ε > 0 is arbitrary, we deduce that lim_{x→c} g = c.


(c) lim_{x→c} x² = c².

Let h(x) := x² for all x ∈ ℝ. We want to make the difference

    |h(x) − c²| = |x² − c²|

less than a preassigned ε > 0 by taking x sufficiently close to c. To do so, we note that
x² − c² = (x + c)(x − c). Moreover, if |x − c| < 1, then

    |x| < |c| + 1   so that   |x + c| ≤ |x| + |c| < 2|c| + 1.

Therefore, if |x − c| < 1, we have

    (1)   |x² − c²| = |x + c||x − c| ≤ (2|c| + 1)|x − c|.

Moreover this last term will be less than ε provided we take |x − c| < ε/(2|c| + 1).
Consequently, if we choose

    δ(ε) := inf{1, ε/(2|c| + 1)},

then if 0 < |x − c| < δ(ε), it will follow first that |x − c| < 1 so that (1) is valid, and
therefore, since |x − c| < ε/(2|c| + 1) that

    |x² − c²| ≤ (2|c| + 1)|x − c| < ε.

Since we have a way of choosing δ(ε) > 0 for an arbitrary choice of ε > 0, we infer that
lim_{x→c} h(x) = lim_{x→c} x² = c².

(d) lim_{x→c} 1/x = 1/c if c > 0.

Let φ(x) := 1/x for x > 0 and let c > 0. To show that lim_{x→c} φ = 1/c we wish to make
the difference

    |φ(x) − 1/c| = |1/x − 1/c|

less than a preassigned ε > 0 by taking x sufficiently close to c > 0. We first note that

    |1/x − 1/c| = |(c − x)/(cx)| = (1/(cx)) |x − c|

for x > 0. It is useful to get an upper bound for the term 1/(cx) that holds in some
neighborhood of c. In particular, if |x − c| < ½c, then ½c < x < (3/2)c (why?), so that

    0 < 1/(cx) < 2/c²   for   |x − c| < ½c.

Therefore, for these values of x we have

    (2)   |φ(x) − 1/c| ≤ (2/c²) |x − c|.

In order to make this last term less than ε it suffices to take |x − c| < ½c²ε. Consequently, if
we choose

    δ(ε) := inf{½c, ½c²ε},

then if 0 < |x − c| < δ(ε), it will follow first that |x − c| < ½c so that (2) is valid, and
therefore, since |x − c| < ½c²ε, that

    |φ(x) − 1/c| = |1/x − 1/c| < ε.

Since ε > 0 is arbitrary, we infer that lim_{x→c} φ = 1/c.


(e) lim_{x→2} (x³ − 4)/(x² + 1) = 4/5.

Let ψ(x) := (x³ − 4)/(x² + 1) for x ∈ ℝ. Then a little algebraic manipulation
gives us

    |ψ(x) − 4/5| = |5x³ − 4x² − 24| / (5(x² + 1))
                 = (|5x² + 6x + 12| / (5(x² + 1))) · |x − 2|.

To get a bound on the coefficient of |x − 2|, we restrict x by the condition 1 < x < 3.
For x in this interval, we have 5x² + 6x + 12 ≤ 5 · 3² + 6 · 3 + 12 = 75 and
5(x² + 1) ≥ 5(1 + 1) = 10, so that

    |ψ(x) − 4/5| ≤ (75/10) |x − 2| = (15/2) |x − 2|.

Now for given ε > 0, we choose

    δ(ε) := inf{1, (2/15)ε}.

Then if 0 < |x − 2| < δ(ε), we have |ψ(x) − (4/5)| ≤ (15/2)|x − 2| < ε. Since ε > 0 is
arbitrary, the assertion is proved.
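
The choices of δ(ε) made in parts (c), (d), and (e) can be spot-checked numerically. The sketch below is not a proof: it samples finitely many points x with 0 < |x − c| < δ(ε) and tests |f(x) − L| < ε, using exact rational arithmetic so that rounding plays no role. The function names (`delta_works`, `delta_square`, `delta_reciprocal`, `delta_psi`) are hypothetical helpers introduced here for illustration.

```python
from fractions import Fraction


def delta_works(f, c, L, delta, eps, samples=200):
    """Return True if |f(x) - L| < eps at every sampled x with 0 < |x - c| < delta."""
    for k in range(1, samples + 1):
        offset = delta * Fraction(k, samples + 1)  # 0 < offset < delta
        for x in (c - offset, c + offset):
            if not abs(f(x) - L) < eps:
                return False
    return True


def delta_square(c, eps):
    """delta(eps) = inf{1, eps/(2|c| + 1)} from part (c)."""
    return min(Fraction(1), eps / (2 * abs(c) + 1))


def delta_reciprocal(c, eps):
    """delta(eps) = inf{c/2, c^2 eps/2} from part (d); assumes c > 0."""
    return min(c / 2, c * c * eps / 2)


def delta_psi(eps):
    """delta(eps) = inf{1, (2/15) eps} from part (e)."""
    return min(Fraction(1), Fraction(2, 15) * eps)


def psi(x):
    return (x**3 - 4) / (x**2 + 1)


if __name__ == "__main__":
    eps = Fraction(1, 100)
    c = Fraction(3)
    print(delta_works(lambda x: x * x, c, c * c, delta_square(c, eps), eps))
    print(delta_works(lambda x: 1 / x, c, 1 / c, delta_reciprocal(c, eps), eps))
    print(delta_works(psi, Fraction(2), Fraction(4, 5), delta_psi(eps), eps))
```

Passing a δ that is too large (for instance δ = 1 with ε = 1/100 in part (c)) makes the check fail, which illustrates why δ must in general depend on ε.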


Sequential Criterion for Limits 


The following important formulation of limit of a function is in terms of limits of 
sequences. This characterization permits the theory of Chapter 3 to be applied to the 
study of limits of functions. 


4.1.8 Theorem (Sequential Criterion) Let f : A → ℝ and let c be a cluster point of A.
Then the following are equivalent.

(i) lim_{x→c} f = L.

(ii) For every sequence (x_n) in A that converges to c such that x_n ≠ c for all n ∈ ℕ, the
sequence (f(x_n)) converges to L.


Proof. (i) ⇒ (ii). Assume f has limit L at c, and suppose (x_n) is a sequence in A with
lim(x_n) = c and x_n ≠ c for all n. We must prove that the sequence (f(x_n)) converges to L.
Let ε > 0 be given. Then by Definition 4.1.4, there exists δ > 0 such that if x ∈ A satisfies
0 < |x − c| < δ, then f(x) satisfies |f(x) − L| < ε. We now apply the definition of
convergent sequence for the given δ to obtain a natural number K(δ) such that if n >
K(δ) then |x_n − c| < δ. But for each such x_n, we have |f(x_n) − L| < ε. Thus if n > K(δ),
then |f(x_n) − L| < ε. Therefore, the sequence (f(x_n)) converges to L.

(ii) ⇒ (i). [The proof is a contrapositive argument.] If (i) is not true, then there exists
an ε₀-neighborhood V_{ε₀}(L) such that no matter what δ-neighborhood of c we pick, there
will be at least one number x_δ in A ∩ V_δ(c) with x_δ ≠ c such that f(x_δ) ∉ V_{ε₀}(L). Hence for
every n ∈ ℕ, the (1/n)-neighborhood of c contains a number x_n such that

    0 < |x_n − c| < 1/n   and   x_n ∈ A,

but such that

    |f(x_n) − L| ≥ ε₀   for all n ∈ ℕ.

We conclude that the sequence (x_n) in A\{c} converges to c, but the sequence (f(x_n)) does
not converge to L. Therefore we have shown that if (i) is not true, then (ii) is not true. We
conclude that (ii) implies (i). Q.E.D.


We shall see in the next section that many of the basic limit properties of functions can
be established by using corresponding properties for convergent sequences. For example,
we know from our work with sequences that if (x_n) is any sequence that converges to a
number c, then (x_n²) converges to c². Therefore, by the sequential criterion, we can
conclude that the function h(x) := x² has limit lim_{x→c} h(x) = c².


Divergence Criteria 


It is often important to be able to show (i) that a certain number is not the limit of a function
at a point, or (ii) that the function does not have a limit at a point. The following result is a 
consequence of (the proof of) Theorem 4.1.8. We leave the details of its proof as an 
important exercise. 


4.1.9 Divergence Criteria Let A ⊆ ℝ, let f : A → ℝ and let c ∈ ℝ be a cluster
point of A.

(a) If L ∈ ℝ, then f does not have limit L at c if and only if there exists a sequence (x_n) in A
with x_n ≠ c for all n ∈ ℕ such that the sequence (x_n) converges to c but the sequence
(f(x_n)) does not converge to L.

(b) The function f does not have a limit at c if and only if there exists a sequence (x_n) in A
with x_n ≠ c for all n ∈ ℕ such that the sequence (x_n) converges to c but the sequence
(f(x_n)) does not converge in ℝ.


We now give some applications of this result to show how it can be used. 


4.1.10 Examples (a) lim_{x→0} (1/x) does not exist in ℝ.

As in Example 4.1.7(d), let φ(x) := 1/x for x > 0. However, here we consider c = 0.
The argument given in Example 4.1.7(d) breaks down if c = 0 since we cannot obtain
a bound such as that in (2) of that example. Indeed, if we take the sequence (x_n) with
x_n := 1/n for n ∈ ℕ, then lim(x_n) = 0, but φ(x_n) = 1/(1/n) = n. As we know, the
sequence (φ(x_n)) = (n) is not convergent in ℝ, since it is not bounded. Hence, by
Theorem 4.1.9(b), lim_{x→0} (1/x) does not exist in ℝ.

(b) lim_{x→0} sgn(x) does not exist.

Let the signum function sgn be defined by

    sgn(x) := +1   for x > 0,
               0   for x = 0,
              −1   for x < 0.

Note that sgn(x) = x/|x| for x ≠ 0. (See Figure 4.1.2.) We shall show that sgn does not
have a limit at x = 0. We shall do this by showing that there is a sequence (x_n) such that
lim(x_n) = 0, but such that (sgn(x_n)) does not converge.


Figure 4.1.2 The signum function


Indeed, let x_n := (−1)^n/n for n ∈ ℕ so that lim(x_n) = 0. However, since

    sgn(x_n) = (−1)^n   for n ∈ ℕ,

it follows from Example 3.4.6(a) that (sgn(x_n)) does not converge. Therefore lim_{x→0} sgn(x)
does not exist.

(c)† lim_{x→0} sin(1/x) does not exist in ℝ.

Let g(x) := sin(1/x) for x ≠ 0. (See Figure 4.1.3.) We shall show that g does not have
a limit at c = 0, by exhibiting two sequences (x_n) and (y_n) with x_n ≠ 0 and y_n ≠ 0 for all
n ∈ ℕ and such that lim(x_n) = 0 and lim(y_n) = 0, but such that lim(g(x_n)) ≠ lim(g(y_n)).
In view of Theorem 4.1.9 this implies that lim_{x→0} g cannot exist. (Explain why.)


Figure 4.1.3 The function g(x) = sin(1/x) (x ≠ 0)


Indeed, we recall from calculus that sin t = 0 if t = nπ for n ∈ ℤ, and that sin t = +1
if t = ½π + 2πn for n ∈ ℤ. Now let x_n := 1/(nπ) for n ∈ ℕ; then lim(x_n) = 0 and g(x_n) =
sin nπ = 0 for all n ∈ ℕ, so that lim(g(x_n)) = 0. On the other hand, let y_n :=
(½π + 2πn)^{−1} for n ∈ ℕ; then lim(y_n) = 0 and g(y_n) = sin(½π + 2πn) = 1 for all
n ∈ ℕ, so that lim(g(y_n)) = 1. We conclude that lim_{x→0} sin(1/x) does not exist.
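
The two sequences used in part (c) can be tabulated in floating-point arithmetic. This is only an illustration under the usual caveat that floating-point values of sin(nπ) are tiny but not exactly 0; the function names `x_n` and `y_n` simply mirror the notation of the example.

```python
import math


def g(x: float) -> float:
    """g(x) = sin(1/x) for x != 0."""
    return math.sin(1.0 / x)


def x_n(n: int) -> float:
    """x_n = 1/(n*pi), so that g(x_n) = sin(n*pi) = 0 in exact arithmetic."""
    return 1.0 / (n * math.pi)


def y_n(n: int) -> float:
    """y_n = 1/(pi/2 + 2*pi*n), so that g(y_n) = 1 in exact arithmetic."""
    return 1.0 / (math.pi / 2 + 2 * math.pi * n)


if __name__ == "__main__":
    for n in (1, 10, 100):
        print(n, x_n(n), g(x_n(n)), y_n(n), g(y_n(n)))
```

Both sequences tend to 0, yet the values of g along them stay near 0 and near 1 respectively, which is exactly the situation covered by the Divergence Criteria 4.1.9.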


†In order to have some interesting applications in this and later examples, we shall make use of well-known
properties of trigonometric and exponential functions that will be established in Chapter 8.


Exercises for Section 4.1

1. Determine a condition on |x − 1| that will assure that:
(a) |x² − 1| < 1/2, (b) |x² − 1| < 1/10³,
(c) |x² − 1| < 1/n for a given n ∈ ℕ, (d) |x³ − 1| < 1/n for a given n ∈ ℕ.

2. Determine a condition on |x − 4| that will assure that:
(a) |√x − 2| < 1/2, (b) |√x − 2| < 10⁻².

3. Let c be a cluster point of A ⊆ ℝ and let f : A → ℝ. Prove that lim_{x→c} f(x) = L if and only if
lim_{x→c} |f(x) − L| = 0.

4. Let f : ℝ → ℝ and let c ∈ ℝ. Show that lim_{x→c} f(x) = L if and only if lim_{x→0} f(x + c) = L.

5. Let I := (0, a) where a > 0, and let g(x) := x² for x ∈ I. For any points x, c ∈ I, show that
|g(x) − c²| ≤ 2a|x − c|. Use this inequality to prove that lim_{x→c} x² = c² for any c ∈ I.

6. Let I be an interval in ℝ, let f : I → ℝ, and let c ∈ I. Suppose there exist constants K and L such
that |f(x) − L| ≤ K|x − c| for x ∈ I. Show that lim_{x→c} f(x) = L.

7. Show that lim_{x→c} x³ = c³ for any c ∈ ℝ.

8. Show that lim_{x→c} √x = √c for any c > 0.

9. Use either the ε-δ definition of limit or the Sequential Criterion for limits, to establish the
following limits.
(a) lim_{x→2} 1/(1 − x) = −1, (b) lim_{x→1} x/(1 + x) = 1/2,
(c) lim_{x→0} x²/|x| = 0, (d) lim_{x→1} (x² − x + 1)/(x + 1) = 1/2.

10. Use the definition of limit to show that
(a) lim_{x→2} (x² + 4x) = 12, (b) lim_{x→−1} (x + 5)/(2x + 3) = 4.

11. Use the definition of limit to prove the following.
(a) lim_{x→3} (2x + 3)/(4x − 9) = 3, (b) lim_{x→6} (x² − 3x)/(x + 3) = 2.

12. Show that the following limits do not exist.
(a) lim_{x→0} 1/x² (x > 0), (b) lim_{x→0} 1/√x (x > 0),
(c) lim_{x→0} (x + sgn(x)), (d) lim_{x→0} sin(1/x²).

13. Suppose the function f : ℝ → ℝ has limit L at 0, and let a > 0. If g : ℝ → ℝ is defined by
g(x) := f(ax) for x ∈ ℝ, show that lim_{x→0} g(x) = L.

14. Let c ∈ ℝ and let f : ℝ → ℝ be such that lim_{x→c} (f(x))² = L.
(a) Show that if L = 0, then lim_{x→c} f(x) = 0.
(b) Show by example that if L ≠ 0, then f may not have a limit at c.

15. Let f : ℝ → ℝ be defined by setting f(x) := x if x is rational, and f(x) := 0 if x is irrational.
(a) Show that f has a limit at x = 0.
(b) Use a sequential argument to show that if c ≠ 0, then f does not have a limit at c.

16. Let f : ℝ → ℝ, let I be an open interval in ℝ, and let c ∈ I. If f_1 is the restriction of f to I, show
that f_1 has a limit at c if and only if f has a limit at c, and that the limits are equal.

17. Let f : ℝ → ℝ, let J be a closed interval in ℝ, and let c ∈ J. If f_2 is the restriction of f to J,
show that if f has a limit at c then f_2 has a limit at c. Show by example that it does not follow that
if f_2 has a limit at c, then f has a limit at c.


Section 4.2 Limit Theorems


We shall now obtain results that are useful in calculating limits of functions. These results 
are parallel to the limit theorems established in Section 3.2 for sequences. In fact, in most 
cases these results can be proved by using Theorem 4.1.8 and results from Section 3.2. 
Alternatively, the results in this section can be proved by using ε-δ arguments that are very
similar to the ones employed in Section 3.2. 


4.2.1 Definition Let A ⊆ ℝ, let f : A → ℝ, and let c ∈ ℝ be a cluster point of A. We say
that f is bounded on a neighborhood of c if there exists a δ-neighborhood V_δ(c) of c and a
constant M > 0 such that we have |f(x)| ≤ M for all x ∈ A ∩ V_δ(c).


4.2.2 Theorem If A ⊆ ℝ and f : A → ℝ has a limit at c ∈ ℝ, then f is bounded on some
neighborhood of c.

Proof. If L := lim_{x→c} f, then for ε = 1, there exists δ > 0 such that if 0 < |x − c| < δ, then
|f(x) − L| < 1; hence (by Corollary 2.2.4(a)),

    |f(x)| − |L| ≤ |f(x) − L| < 1.

Therefore, if x ∈ A ∩ V_δ(c), x ≠ c, then |f(x)| ≤ |L| + 1. If c ∉ A, we take M = |L| + 1,
while if c ∈ A we take M := sup{|f(c)|, |L| + 1}. It follows that if x ∈ A ∩ V_δ(c), then
|f(x)| ≤ M. This shows that f is bounded on the neighborhood V_δ(c) of c. Q.E.D.


The next definition is similar to the definition for sums, differences, products, and 
quotients of sequences given in Section 3.2. 


4.2.3 Definition Let A ⊆ ℝ and let f and g be functions defined on A to ℝ. We define the
sum f + g, the difference f − g, and the product fg on A to ℝ to be the functions given by

    (f + g)(x) := f(x) + g(x),   (f − g)(x) := f(x) − g(x),
    (fg)(x) := f(x)g(x)

for all x ∈ A. Further, if b ∈ ℝ, we define the multiple bf to be the function given by

    (bf)(x) := bf(x)   for all x ∈ A.

Finally, if h(x) ≠ 0 for x ∈ A, we define the quotient f/h to be the function given by

    (f/h)(x) := f(x)/h(x)   for all x ∈ A.


4.2.4 Theorem Let A ⊆ ℝ, let f and g be functions on A to ℝ, and let c ∈ ℝ be a cluster
point of A. Further, let b ∈ ℝ.

(a) If lim_{x→c} f = L and lim_{x→c} g = M, then:

    lim_{x→c} (f + g) = L + M,   lim_{x→c} (f − g) = L − M,
    lim_{x→c} (fg) = LM,         lim_{x→c} (bf) = bL.

(b) If h : A → ℝ, if h(x) ≠ 0 for all x ∈ A, and if lim_{x→c} h = H ≠ 0, then

    lim_{x→c} (f/h) = L/H.

Proof. One proof of this theorem is exactly similar to that of Theorem 3.2.3. Alternatively,
it can be proved by making use of Theorems 3.2.3 and 4.1.8. For example, let (x_n) be
any sequence in A such that x_n ≠ c for n ∈ ℕ, and c = lim(x_n). It follows from Theorem
4.1.8 that

    lim(f(x_n)) = L,   lim(g(x_n)) = M.

On the other hand, Definition 4.2.3 implies that

    (fg)(x_n) = f(x_n)g(x_n)   for n ∈ ℕ.

Therefore an application of Theorem 3.2.3 yields

    lim((fg)(x_n)) = lim(f(x_n)g(x_n))
                   = [lim(f(x_n))] [lim(g(x_n))] = LM.

Consequently, it follows from Theorem 4.1.8 that

    lim_{x→c} (fg) = lim((fg)(x_n)) = LM.

The other parts of this theorem are proved in a similar manner. We leave the details to
the reader. Q.E.D.


Remark Let A ⊆ ℝ, and let f_1, f_2, …, f_n be functions on A to ℝ, and let c be a cluster
point of A. If L_k := lim_{x→c} f_k for k = 1, …, n, then it follows from Theorem 4.2.4 by an
Induction argument that

    L_1 + L_2 + ··· + L_n = lim_{x→c} (f_1 + f_2 + ··· + f_n),

and

    L_1 · L_2 ··· L_n = lim_{x→c} (f_1 · f_2 ··· f_n).

In particular, we deduce that if L = lim_{x→c} f and n ∈ ℕ, then

    L^n = lim_{x→c} (f(x))^n.


4.2.5 Examples (a) Some of the limits that were established in Section 4.1 can be
proved by using Theorem 4.2.4. For example, it follows from this result that since
lim_{x→c} x = c, then lim_{x→c} x² = c², and that if c > 0, then

    lim_{x→c} 1/x = 1/(lim_{x→c} x) = 1/c.

(b) lim_{x→2} (x² + 1)(x³ − 4) = 20.

It follows from Theorem 4.2.4 that

    lim_{x→2} (x² + 1)(x³ − 4) = (lim_{x→2} (x² + 1)) (lim_{x→2} (x³ − 4))
                               = 5 · 4 = 20.

(c) lim_{x→2} (x³ − 4)/(x² + 1) = 4/5.

If we apply Theorem 4.2.4(b), we have

    lim_{x→2} (x³ − 4)/(x² + 1) = (lim_{x→2} (x³ − 4)) / (lim_{x→2} (x² + 1)) = 4/5.

Note that since the limit in the denominator [i.e., lim_{x→2} (x² + 1) = 5] is not equal to 0, then
Theorem 4.2.4(b) is applicable.

(d) lim_{x→2} (x² − 4)/(3x − 6) = 4/3.

If we let f(x) := x² − 4 and h(x) := 3x − 6 for x ∈ ℝ, then we cannot use Theorem
4.2.4(b) to evaluate lim_{x→2} (f(x)/h(x)) because

    H = lim_{x→2} h(x) = lim_{x→2} (3x − 6) = 3 · 2 − 6 = 0.

However, if x ≠ 2, then it follows that

    (x² − 4)/(3x − 6) = (x + 2)(x − 2)/(3(x − 2)) = (1/3)(x + 2).

Therefore we have

    lim_{x→2} (x² − 4)/(3x − 6) = lim_{x→2} (1/3)(x + 2) = (1/3)(lim_{x→2} x + 2) = 4/3.

Note that the function g(x) = (x² − 4)/(3x − 6) has a limit at x = 2 even though it is not
defined there.
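
The cancellation in part (d) can be observed with exact rational arithmetic. In the sketch below (an illustration, not part of the original argument) the function `g` is deliberately left undefined at x = 2, and it is evaluated at the hypothetical sample points 2 + 10^(−k); the values agree exactly with (x + 2)/3 and approach 4/3.

```python
from fractions import Fraction


def g(x: Fraction) -> Fraction:
    """g(x) = (x^2 - 4)/(3x - 6); not defined at x = 2."""
    if x == 2:
        raise ValueError("g is not defined at x = 2")
    return (x * x - 4) / (3 * x - 6)


if __name__ == "__main__":
    limit = Fraction(4, 3)
    for k in range(1, 6):
        x = 2 + Fraction(1, 10**k)
        # For x != 2 the quotient equals (x + 2)/3, so g(x) - 4/3 = (x - 2)/3.
        print(x, g(x), g(x) - limit)
```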


(e) lim_{x→0} 1/x does not exist in ℝ.

Of course lim_{x→0} 1 = 1 and H := lim_{x→0} x = 0. However, since H = 0, we cannot use
Theorem 4.2.4(b) to evaluate lim_{x→0} (1/x). In fact, as was seen in Example 4.1.10(a), the
function φ(x) = 1/x does not have a limit at x = 0. This conclusion also follows from
Theorem 4.2.2 since the function φ(x) = 1/x is not bounded on a neighborhood of x = 0.

(f) If p is a polynomial function, then lim_{x→c} p(x) = p(c).

Let p be a polynomial function on ℝ so that p(x) = a_n x^n + a_{n−1} x^{n−1} + ··· + a_1 x +
a_0 for all x ∈ ℝ. It follows from Theorem 4.2.4 and the fact that lim_{x→c} x^k = c^k that

    lim_{x→c} p(x) = lim_{x→c} [a_n x^n + a_{n−1} x^{n−1} + ··· + a_1 x + a_0]
                   = lim_{x→c} (a_n x^n) + lim_{x→c} (a_{n−1} x^{n−1}) + ··· + lim_{x→c} (a_1 x) + lim_{x→c} a_0
                   = a_n c^n + a_{n−1} c^{n−1} + ··· + a_1 c + a_0
                   = p(c).

Hence lim_{x→c} p(x) = p(c) for any polynomial function p.

(g) If p and q are polynomial functions on ℝ and if q(c) ≠ 0, then

    lim_{x→c} p(x)/q(x) = p(c)/q(c).

Since q(x) is a polynomial function, it follows from a theorem in algebra that there are at most
a finite number of real numbers α_1, …, α_m [the real zeroes of q(x)] such that q(α_j) = 0 and
such that if x ∉ {α_1, …, α_m}, then q(x) ≠ 0. Hence, if x ∉ {α_1, …, α_m}, we can define

    r(x) := p(x)/q(x).

If c is not a zero of q(x), then q(c) ≠ 0, and it follows from part (f) that lim_{x→c} q(x) = q(c) ≠ 0.
Therefore we can apply Theorem 4.2.4(b) to conclude that

    lim_{x→c} p(x)/q(x) = (lim_{x→c} p(x)) / (lim_{x→c} q(x)) = p(c)/q(c).


The next result is a direct analogue of Theorem 3.2.6. 


4.2.6 Theorem Let A ⊆ ℝ, let f : A → ℝ, and let c ∈ ℝ be a cluster point of A. If

    a ≤ f(x) ≤ b   for all x ∈ A, x ≠ c,

and if lim_{x→c} f exists, then a ≤ lim_{x→c} f ≤ b.

Proof. Indeed, if L = lim_{x→c} f, then it follows from Theorem 4.1.8 that if (x_n) is any
sequence of real numbers such that c ≠ x_n ∈ A for all n ∈ ℕ and if the sequence (x_n)
converges to c, then the sequence (f(x_n)) converges to L. Since a ≤ f(x_n) ≤ b for all
n ∈ ℕ, it follows from Theorem 3.2.6 that a ≤ L ≤ b. Q.E.D.


We now state an analogue of the Squeeze Theorem 3.2.7. We leave its proof to the reader. 


4.2.7 Squeeze Theorem Let A ⊆ ℝ, let f, g, h : A → ℝ, and let c ∈ ℝ be a cluster point
of A. If

    f(x) ≤ g(x) ≤ h(x)   for all x ∈ A, x ≠ c,

and if lim_{x→c} f = L = lim_{x→c} h, then lim_{x→c} g = L.


4.2.8 Examples (a) lim_{x→0} x^{3/2} = 0 (x > 0).

Let f(x) := x^{3/2} for x > 0. Since the inequality x ≤ x^{1/2} ≤ 1 holds for 0 < x ≤ 1
(why?), it follows that x² ≤ f(x) = x^{3/2} ≤ x for 0 < x ≤ 1. Since

    lim_{x→0} x² = 0   and   lim_{x→0} x = 0,

it follows from the Squeeze Theorem 4.2.7 that lim_{x→0} x^{3/2} = 0.

(b) lim_{x→0} sin x = 0.

It will be proved later (see Theorem 8.4.8), that

    −x ≤ sin x ≤ x   for all x ≥ 0.

Since lim_{x→0} (±x) = 0, it follows from the Squeeze Theorem that lim_{x→0} sin x = 0.

(c) lim_{x→0} cos x = 1.

It will be proved later (see Theorem 8.4.8) that

    (1)   1 − ½x² ≤ cos x ≤ 1   for all x ∈ ℝ.

Since lim_{x→0} (1 − ½x²) = 1, it follows from the Squeeze Theorem that lim_{x→0} cos x = 1.


(d) lim_{x→0} ((cos x − 1)/x) = 0.

We cannot use Theorem 4.2.4(b) to evaluate this limit. (Why not?) However, it follows
from the inequality (1) in part (c) that

    −½x ≤ (cos x − 1)/x ≤ 0   for x > 0

and that

    0 ≤ (cos x − 1)/x ≤ −½x   for x < 0.

Now let f(x) := −x/2 for x ≥ 0 and f(x) := 0 for x < 0, and let h(x) := 0 for x ≥ 0 and
h(x) := −x/2 for x < 0. Then we have

    f(x) ≤ (cos x − 1)/x ≤ h(x)   for x ≠ 0.

Since it is readily seen that lim_{x→0} f = 0 = lim_{x→0} h, it follows from the Squeeze Theorem that
lim_{x→0} (cos x − 1)/x = 0.


(e) lim_{x→0} ((sin x)/x) = 1.

Again we cannot use Theorem 4.2.4(b) to evaluate this limit. However, it will be
proved later (see Theorem 8.4.8) that

    x − (1/6)x³ ≤ sin x ≤ x   for x ≥ 0

and that

    x ≤ sin x ≤ x − (1/6)x³   for x ≤ 0.

Therefore it follows (why?) that

    1 − (1/6)x² ≤ (sin x)/x ≤ 1   for all x ≠ 0.

But since lim_{x→0} (1 − (1/6)x²) = 1 − (1/6) · lim_{x→0} x² = 1, we infer from the Squeeze Theorem that
lim_{x→0} (sin x)/x = 1.

(f) lim_{x→0} (x sin(1/x)) = 0.

Let f(x) = x sin(1/x) for x ≠ 0. Since −1 ≤ sin z ≤ 1 for all z ∈ ℝ, we have the
inequality

    −|x| ≤ f(x) = x sin(1/x) ≤ |x|

for all x ∈ ℝ, x ≠ 0. Since lim_{x→0} |x| = 0, it follows from the Squeeze Theorem that lim_{x→0} f = 0.


For a graph, see Figure 5.1.3 or the cover of this book. 
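
The squeezing inequalities used in parts (e) and (f) can be spot-checked in floating-point arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the small tolerance `tol` in the first check is an assumption added to absorb rounding error, not part of the mathematical statement.

```python
import math


def sinc_squeezed(x: float, tol: float = 1e-12) -> bool:
    """Check 1 - x^2/6 <= sin(x)/x <= 1 for x != 0, up to a rounding tolerance."""
    q = math.sin(x) / x
    return 1 - x * x / 6 - tol <= q <= 1 + tol


def x_sin_squeezed(x: float) -> bool:
    """Check -|x| <= x*sin(1/x) <= |x| for x != 0."""
    return abs(x * math.sin(1 / x)) <= abs(x)


if __name__ == "__main__":
    for k in (1, 10, 100, 1000):
        x = 1.0 / k
        print(x, math.sin(x) / x, x * math.sin(1 / x))
```

As x shrinks, the printed values of (sin x)/x approach 1 while those of x sin(1/x) are trapped between −|x| and |x|.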


There are results that are parallel to Theorems 3.2.9 and 3.2.10; however, we will leave 
them as exercises. We conclude this section with a result that is, in some sense, a partial 
converse to Theorem 4.2.6. 


4.2.9 Theorem Let A ⊆ ℝ, let f : A → ℝ and let c ∈ ℝ be a cluster point of A. If

    lim_{x→c} f > 0   [respectively, lim_{x→c} f < 0],

then there exists a neighborhood V_δ(c) of c such that f(x) > 0 [respectively, f(x) < 0] for
all x ∈ A ∩ V_δ(c), x ≠ c.

Proof. Let L := lim_{x→c} f and suppose that L > 0. We take ε = ½L > 0 in Definition 4.1.4, and
obtain a number δ > 0 such that if 0 < |x − c| < δ and x ∈ A, then |f(x) − L| < ½L.
Therefore (why?) it follows that if x ∈ A ∩ V_δ(c), x ≠ c, then f(x) > ½L > 0.
If L < 0, a similar argument applies. Q.E.D.


Exercises for Section 4.2

1. Apply Theorem 4.2.4 to determine the following limits:
(a) lim_{x→1} (x + 1)(2x + 3) (x ∈ ℝ), (b) lim_{x→1} (x² + 2)/(x² − 2) (x > 0),
(c) lim_{x→2} (1/(x + 1) − 1/(2x)) (x > 0), (d) lim_{x→0} (x + 1)/(x² + 2) (x ∈ ℝ).

2. Determine the following limits and state which theorems are used in each case. (You may wish
to use Exercise 15 below.)
(a) lim_{x→2} √((2x + 1)/(x + 3)) (x > 0), (b) lim_{x→2} (x² − 4)/(x − 2) (x > 0),
(c) lim_{x→0} ((x + 1)² − 1)/x (x > 0), (d) lim_{x→1} (√x − 1)/(x − 1) (x > 0).

3. Find lim_{x→0} (√(1 + 2x) − √(1 + 3x))/(x + 2x²) where x > 0.


4. Prove that lim_{x→0} cos(1/x) does not exist but that lim_{x→0} x cos(1/x) = 0.

5. Let f, g be defined on A ⊆ ℝ to ℝ, and let c be a cluster point of A. Suppose that f is bounded on a
neighborhood of c and that lim_{x→c} g = 0. Prove that lim_{x→c} fg = 0.

6. Use the definition of the limit to prove the first assertion in Theorem 4.2.4(a).

7. Use the sequential formulation of the limit to prove Theorem 4.2.4(b).

8. Let n ∈ ℕ be such that n ≥ 3. Derive the inequality −x² ≤ x^n ≤ x² for −1 < x < 1. Then use
the fact that lim_{x→0} x² = 0 to show that lim_{x→0} x^n = 0.

9. Let f, g be defined on A to ℝ and let c be a cluster point of A.
(a) Show that if both lim_{x→c} f and lim_{x→c} (f + g) exist, then lim_{x→c} g exists.
(b) If lim_{x→c} f and lim_{x→c} fg exist, does it follow that lim_{x→c} g exists?

10. Give examples of functions f and g such that f and g do not have limits at a point c, but such that
both f + g and fg have limits at c.

11. Determine whether the following limits exist in ℝ.
(a) lim_{x→0} sin(1/x²) (x ≠ 0), (b) lim_{x→0} x sin(1/x²) (x ≠ 0),
(c) lim_{x→0} sgn sin(1/x) (x ≠ 0), (d) lim_{x→0} √x sin(1/x²) (x > 0).

12. Let f : ℝ → ℝ be such that f(x + y) = f(x) + f(y) for all x, y in ℝ. Assume that lim_{x→0} f = L
exists. Prove that L = 0, and then prove that f has a limit at every point c ∈ ℝ. [Hint: First note that
f(2x) = f(x) + f(x) = 2f(x) for x ∈ ℝ. Also note that f(x) = f(x − c) + f(c) for x, c in ℝ.]

13. Functions f and g are defined on ℝ by f(x) := x + 1 and g(x) := 2 if x ≠ 1 and g(1) := 0.
(a) Find lim_{x→1} g(f(x)) and compare with the value of g(lim_{x→1} f(x)).
(b) Find lim_{x→1} f(g(x)) and compare with the value of f(lim_{x→1} g(x)).

14. Let A ⊆ ℝ, let f : A → ℝ and let c ∈ ℝ be a cluster point of A. If lim_{x→c} f exists, and if |f| denotes
the function defined for x ∈ A by |f|(x) := |f(x)|, prove that lim_{x→c} |f| = |lim_{x→c} f|.

15. Let A ⊆ ℝ, let f : A → ℝ, and let c ∈ ℝ be a cluster point of A. In addition, suppose that
f(x) ≥ 0 for all x ∈ A, and let √f be the function defined for x ∈ A by (√f)(x) := √(f(x)). If
lim_{x→c} f exists, prove that lim_{x→c} √f = √(lim_{x→c} f).


Section 4.3 Some Extensions of the Limit Concept†


In this section, we shall present three types of extensions of the notion of a limit of a 
function that often occur. Since all the ideas here are closely parallel to ones we have 
already encountered, this section can be read easily. 


†This section can be largely omitted on a first reading of this chapter.


One-Sided Limits


There are times when a function f may not possess a limit at a point c, yet a limit 
does exist when the function is restricted to an interval on one side of the cluster 
point c. 

For example, the signum function considered in Example 4.1.10(b), and illustrated
in Figure 4.1.2, has no limit at c = 0. However, if we restrict the signum function
to the interval (0, ∞), the resulting function has a limit of 1 at c = 0. Similarly, if
we restrict the signum function to the interval (−∞, 0), the resulting function has a limit
of −1 at c = 0. These are elementary examples of right-hand and left-hand limits at
c = 0.


4.3.1 Definition Let A ⊆ ℝ and let f : A → ℝ.

(i) If c ∈ ℝ is a cluster point of the set A ∩ (c, ∞) = {x ∈ A : x > c}, then we say that
L ∈ ℝ is a right-hand limit of f at c and we write

    lim_{x→c+} f = L   or   lim_{x→c+} f(x) = L

if given any ε > 0 there exists a δ = δ(ε) > 0 such that for all x ∈ A with
0 < x − c < δ, then |f(x) − L| < ε.

(ii) If c ∈ ℝ is a cluster point of the set A ∩ (−∞, c) = {x ∈ A : x < c}, then we say that
L ∈ ℝ is a left-hand limit of f at c and we write

    lim_{x→c−} f = L   or   lim_{x→c−} f(x) = L

if given any ε > 0 there exists a δ > 0 such that for all x ∈ A with 0 < c − x < δ,
then |f(x) − L| < ε.


Notes (1) The limits lim_{x→c+} f and lim_{x→c−} f are called one-sided limits of f at c. It is possible
that neither one-sided limit may exist. Also, one of them may exist without the other
existing. Similarly, as is the case for f(x) := sgn(x) at c = 0, they may both exist and be
different.

(2) If A is an interval with left endpoint c, then it is readily seen that f : A → ℝ has a limit
at c if and only if it has a right-hand limit at c. Moreover, in this case the limit lim_{x→c} f and the
right-hand limit lim_{x→c+} f are equal. (A similar situation occurs for the left-hand limit when A
is an interval with right endpoint c.)


The reader can show that f can have only one right-hand (respectively, left-hand) limit 
at a point. There are results analogous to those established in Sections 4.1 and 4.2 for two- 
sided limits. In particular, the existence of one-sided limits can be reduced to sequential 
considerations. 


4.3.2 Theorem Let A ⊆ ℝ, let f : A → ℝ, and let c ∈ ℝ be a cluster point of A ∩ (c, ∞).
Then the following statements are equivalent:

(i) lim_{x→c+} f = L.

(ii) For every sequence (x_n) that converges to c such that x_n ∈ A and x_n > c for all
n ∈ ℕ, the sequence (f(x_n)) converges to L.


Figure 4.3.1 Graph of g(x) = e^{1/x} (x ≠ 0)

Figure 4.3.2 Graph of h(x) = 1/(e^{1/x} + 1) (x ≠ 0)


We leave the proof of this result (and the formulation and proof of the analogous 
result for left-hand limits) to the reader. We will not take the space to write out the 
formulations of the one-sided version of the other results in Sections 4.1 and 4.2. 

The following result relates the notion of the limit of a function to one-sided limits. We 
leave its proof as an exercise. 


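4.3.3 Theorem Let $A \subseteq \mathbb{R}$, let $f : A \to \mathbb{R}$, and let $c \in \mathbb{R}$ be a cluster point of both 
of the sets $A \cap (c, \infty)$ and $A \cap (-\infty, c)$. Then $\lim_{x \to c} f = L$ if and only if 
$\lim_{x \to c+} f = L = \lim_{x \to c-} f$. 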

4.3.4 Examples (a) Let $f(x) := \operatorname{sgn}(x)$. 

We have seen in Example 4.1.10(b) that sgn does not have a limit at 0. It is clear that 
$\lim_{x \to 0+} \operatorname{sgn}(x) = +1$ and that $\lim_{x \to 0-} \operatorname{sgn}(x) = -1$. Since these one-sided limits are different, it 
also follows from Theorem 4.3.3 that sgn(x) does not have a limit at 0. 

(b) Let $g(x) := e^{1/x}$ for $x \neq 0$. (See Figure 4.3.1.) 

We first show that g does not have a finite right-hand limit at $c = 0$ since it is 
not bounded on any right-hand neighborhood $(0, \delta)$ of 0. We shall make use of the 
inequality 

\[ 0 < t < e^{t} \quad\text{for}\quad t > 0, \tag{1} \]

which will be proved later (see Corollary 8.3.3). It follows from (1) that if $x > 0$, then 
$0 < 1/x < e^{1/x}$. Hence, if we take $x_n = 1/n$, then $g(x_n) > n$ for all $n \in \mathbb{N}$. Therefore 
$\lim_{x \to 0+} e^{1/x}$ does not exist in $\mathbb{R}$. 

However, $\lim_{x \to 0-} e^{1/x} = 0$. Indeed, if $x < 0$ and we take $t = -1/x$ in (1) we obtain 
$0 < -1/x < e^{-1/x}$. Since $x < 0$, this implies that $0 < e^{1/x} < -x$ for all $x < 0$. It follows 
from this inequality that $\lim_{x \to 0-} e^{1/x} = 0$. 


(c) Let $h(x) := 1/(e^{1/x} + 1)$ for $x \neq 0$. (See Figure 4.3.2.) 
We have seen in part (b) that $0 < 1/x < e^{1/x}$ for $x > 0$, whence 

\[ 0 < \frac{1}{e^{1/x} + 1} < \frac{1}{e^{1/x}} < x, \]

which implies that $\lim_{x \to 0+} h = 0$. 
Since we have seen in part (b) that $\lim_{x \to 0-} e^{1/x} = 0$, it follows from the analogue of 
Theorem 4.2.4(b) for left-hand limits that 

\[ \lim_{x \to 0-} \left( \frac{1}{e^{1/x} + 1} \right) = \frac{1}{\lim_{x \to 0-} e^{1/x} + 1} = \frac{1}{0 + 1} = 1. \]


Note that for this function, both one-sided limits exist in R, but they are unequal. 
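
The one-sided behavior in Examples 4.3.4(b) and (c) can be checked numerically along the sequences $x_n = 1/n$ and $x_n = -1/n$, in the spirit of Theorem 4.3.2. The following is only an illustrative sketch: the sample indices in `ns` are an arbitrary choice, and evaluating finitely many terms suggests the limits but proves nothing.

```python
import math

def g(x):
    """g(x) = e^(1/x) for x != 0, as in Example 4.3.4(b)."""
    return math.exp(1.0 / x)

def h(x):
    """h(x) = 1 / (e^(1/x) + 1) for x != 0, as in Example 4.3.4(c)."""
    return 1.0 / (math.exp(1.0 / x) + 1.0)

# Sample indices n (an arbitrary choice, kept small enough to avoid overflow).
ns = [1, 10, 100, 500]

right_h = [h(1.0 / n) for n in ns]    # x_n = 1/n  -> 0 from the right
left_h = [h(-1.0 / n) for n in ns]    # x_n = -1/n -> 0 from the left
left_g = [g(-1.0 / n) for n in ns]

# The bounds used in the text: 0 < e^(1/x) < -x for x < 0, and 0 < h(x) < x for x > 0.
left_bound_ok = all(0.0 < g(-1.0 / n) < 1.0 / n for n in ns)
right_bound_ok = all(0.0 < h(1.0 / n) < 1.0 / n for n in ns)

print(right_h[-1], left_h[-1], left_g[-1])
```

The values of h along the right-hand sequence shrink toward 0 while those along the left-hand sequence approach 1, matching the two unequal one-sided limits.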


Infinite Limits 


The function $f(x) := 1/x^2$ for $x \neq 0$ (see Figure 4.3.3) is not bounded on a neighborhood 
of 0, so it cannot have a limit in the sense of Definition 4.1.4. While the symbols 
$\infty$ $(= +\infty)$ and $-\infty$ do not represent real numbers, it is sometimes useful to be able to 
say that “$f(x) = 1/x^2$ tends to $\infty$ as $x \to 0$.” This use of $\pm\infty$ will not cause any 
difficulties, provided we exercise caution and never interpret $\infty$ or $-\infty$ as being real 
numbers. 


Figure 4.3.3 Graph of $f(x) = 1/x^2$ $(x \neq 0)$
Figure 4.3.4 Graph of $g(x) = 1/x$ $(x \neq 0)$


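4.3.5 Definition Let $A \subseteq \mathbb{R}$, let $f : A \to \mathbb{R}$, and let $c \in \mathbb{R}$ be a cluster point of A. 
(i) We say that f tends to $\infty$ as $x \to c$, and write 

\[ \lim_{x \to c} f = \infty, \]

if for every $\alpha \in \mathbb{R}$ there exists $\delta = \delta(\alpha) > 0$ such that for all $x \in A$ with 
$0 < |x - c| < \delta$, then $f(x) > \alpha$. 
(ii) We say that f tends to $-\infty$ as $x \to c$, and write 

\[ \lim_{x \to c} f = -\infty, \]

if for every $\beta \in \mathbb{R}$ there exists $\delta = \delta(\beta) > 0$ such that for all $x \in A$ with 
$0 < |x - c| < \delta$, then $f(x) < \beta$. 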


4.3.6 Examples (a) $\lim_{x \to 0} (1/x^2) = \infty$. 

For, if $\alpha > 0$ is given, let $\delta := 1/\sqrt{\alpha}$. It follows that if $0 < |x| < \delta$, then $x^2 < 1/\alpha$ so 
that $1/x^2 > \alpha$. 
(b) Let $g(x) := 1/x$ for $x \neq 0$. (See Figure 4.3.4.) 
The function g does not tend to either $\infty$ or $-\infty$ as $x \to 0$. For, if $\alpha > 0$ then $g(x) < \alpha$ 
for all $x < 0$, so that g does not tend to $\infty$ as $x \to 0$. Similarly, if $\beta < 0$ then $g(x) > \beta$ for all 
$x > 0$, so that g does not tend to $-\infty$ as $x \to 0$. 
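
The choice $\delta := 1/\sqrt{\alpha}$ in Example 4.3.6(a) can be spot-checked numerically. This is a minimal sketch under stated assumptions: the helper names `delta_for` and `check` and the sampling grid are illustrative, not part of the text, and a finite sample is only a sanity check of the inequality, not a proof.

```python
import math

def delta_for(alpha):
    """delta = 1/sqrt(alpha), the choice made in Example 4.3.6(a); needs alpha > 0."""
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    return 1.0 / math.sqrt(alpha)

def check(alpha, samples=1000):
    """Test 1/x^2 > alpha at sample points with 0 < |x| < delta, on both sides of 0."""
    d = delta_for(alpha)
    xs = [s * d * k / (samples + 1)
          for k in range(1, samples + 1)
          for s in (1.0, -1.0)]
    return all(1.0 / (x * x) > alpha for x in xs)

# For g(x) = 1/x the same idea fails: g stays below alpha = 1 at every x < 0,
# however close to 0, which is why g does not tend to infinity as x -> 0.
negative_side_bounded = all(1.0 / x < 1.0 for x in (-1e-3, -1e-6, -1e-9))

print(check(1.0), check(1e6), negative_side_bounded)
```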


While many of the results in Sections 4.1 and 4.2 have extensions to this limiting 
notion, not all of them do since $\pm\infty$ are not real numbers. The following result is an 
analogue of the Squeeze Theorem 4.2.7. (See also Theorem 3.6.4.) 


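4.3.7 Theorem Let $A \subseteq \mathbb{R}$, let $f, g : A \to \mathbb{R}$, and let $c \in \mathbb{R}$ be a cluster point of A. 
Suppose that $f(x) \le g(x)$ for all $x \in A$, $x \neq c$. 

(a) If $\lim_{x \to c} f = \infty$, then $\lim_{x \to c} g = \infty$. 

(b) If $\lim_{x \to c} g = -\infty$, then $\lim_{x \to c} f = -\infty$. 

Proof. (a) If $\lim_{x \to c} f = \infty$ and $\alpha \in \mathbb{R}$ is given, then there exists $\delta(\alpha) > 0$ such that if 
$0 < |x - c| < \delta(\alpha)$ and $x \in A$, then $f(x) > \alpha$. But since $f(x) \le g(x)$ for all $x \in A$, $x \neq c$, 
it follows that if $0 < |x - c| < \delta(\alpha)$ and $x \in A$, then $g(x) > \alpha$. Therefore $\lim_{x \to c} g = \infty$. 

The proof of (b) is similar. Q.E.D. 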


The function g(x) = 1/x considered in Example 4.3.6(b) suggests that it might be 
useful to consider one-sided infinite limits. We will define only right-hand infinite 
limits. 


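4.3.8 Definition Let $A \subseteq \mathbb{R}$ and let $f : A \to \mathbb{R}$. If $c \in \mathbb{R}$ is a cluster point of the set 
$A \cap (c, \infty) = \{x \in A : x > c\}$, then we say that f tends to $\infty$ [respectively, $-\infty$] as 
$x \to c+$, and we write 

\[ \lim_{x \to c+} f = \infty \quad \left[\text{respectively, } \lim_{x \to c+} f = -\infty\right], \]

if for every $\alpha \in \mathbb{R}$ there is $\delta = \delta(\alpha) > 0$ such that for all $x \in A$ with $0 < x - c < \delta$, then 
$f(x) > \alpha$ [respectively, $f(x) < \alpha$]. 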

4.3.9 Examples (a) Let $g(x) := 1/x$ for $x \neq 0$. We have noted in Example 4.3.6(b) that 
$\lim_{x \to 0} g$ does not exist. However, it is an easy exercise to show that 

\[ \lim_{x \to 0+} (1/x) = \infty \quad\text{and}\quad \lim_{x \to 0-} (1/x) = -\infty. \]

(b) It was seen in Example 4.3.4(b) that the function $g(x) := e^{1/x}$ for $x \neq 0$ is not 
bounded on any interval $(0, \delta)$, $\delta > 0$. Hence the right-hand limit of $e^{1/x}$ as $x \to 0+$ does 
not exist in the sense of Definition 4.3.1(i). However, since 

\[ 1/x < e^{1/x} \quad\text{for}\quad x > 0, \]

it is readily seen that $\lim_{x \to 0+} e^{1/x} = \infty$ in the sense of Definition 4.3.8. 


Limits at Infinity 


It is also desirable to define the notion of the limit of a function as $x \to \infty$. The definition as 
$x \to -\infty$ is similar. 

4.3.10 Definition Let $A \subseteq \mathbb{R}$ and let $f : A \to \mathbb{R}$. Suppose that $(a, \infty) \subseteq A$ for some 
$a \in \mathbb{R}$. We say that $L \in \mathbb{R}$ is a limit of f as $x \to \infty$, and write 

\[ \lim_{x \to \infty} f = L \quad\text{or}\quad \lim_{x \to \infty} f(x) = L, \]

if given any $\varepsilon > 0$ there exists $K = K(\varepsilon) > a$ such that for any $x > K$, then $|f(x) - L| < \varepsilon$. 


The reader should note the close resemblance between 4.3.10 and the definition of a 
limit of a sequence. 

We leave it to the reader to show that the limits of f as $x \to \pm\infty$ are unique whenever 
they exist. We also have sequential criteria for these limits; we shall only state the criterion 
as $x \to \infty$. This uses the notion of the limit of a properly divergent sequence (see 
Definition 3.6.1). 


4.3.11 Theorem Let $A \subseteq \mathbb{R}$, let $f : A \to \mathbb{R}$, and suppose that $(a, \infty) \subseteq A$ for some 
$a \in \mathbb{R}$. Then the following statements are equivalent: 

(i) $L = \lim_{x \to \infty} f$. 

(ii) For every sequence $(x_n)$ in $A \cap (a, \infty)$ such that $\lim(x_n) = \infty$, the sequence $(f(x_n))$ 
converges to L. 


We leave it to the reader to prove this theorem and to formulate and prove the 
companion result concerning the limit as $x \to -\infty$. 


4.3.12 Examples (a) Let $g(x) := 1/x$ for $x \neq 0$. 
It is an elementary exercise to show that $\lim_{x \to \infty} (1/x) = 0 = \lim_{x \to -\infty} (1/x)$. (See Figure 
4.3.4.) 
(b) Let $f(x) := 1/x^2$ for $x \neq 0$. 
The reader may show that $\lim_{x \to \infty} (1/x^2) = 0 = \lim_{x \to -\infty} (1/x^2)$. (See Figure 4.3.3.) One 
way to do this is to show that if $x > 1$ then $0 < 1/x^2 < 1/x$. In view of part (a), this implies 
that $\lim_{x \to \infty} (1/x^2) = 0$. 
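
Definition 4.3.10 asks for a threshold $K(\varepsilon)$ beyond which f stays within $\varepsilon$ of its limit. For $f(x) = 1/x^2$, the comparison with $1/x$ in part (b) suggests that $K = \max\{1, 1/\varepsilon\}$ should work. The sketch below tests that candidate at finitely many points; the function names and the sampling scheme are assumptions made for illustration only.

```python
def K_for(eps):
    """Candidate threshold for f(x) = 1/x**2 with limit 0 as x -> infinity.

    For x > max(1, 1/eps) we have 1/x**2 < 1/x < eps, mirroring the
    comparison with 1/x used in Example 4.3.12(b).
    """
    if eps <= 0:
        raise ValueError("eps must be positive")
    return max(1.0, 1.0 / eps)

def holds(eps, steps=1000):
    """Check |f(x) - 0| < eps at sample points x > K(eps)."""
    K = K_for(eps)
    xs = [K * (1.0 + k / 10.0) for k in range(1, steps + 1)]
    return all(abs(1.0 / x**2) < eps for x in xs)

print(K_for(1e-3), holds(1e-3))
```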


Just as it is convenient to be able to say that $f(x) \to \pm\infty$ as $x \to c$ for $c \in \mathbb{R}$, it is 
convenient to have the corresponding notion as $x \to \pm\infty$. We will treat the case where 
$x \to \infty$. 


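4.3.13 Definition Let $A \subseteq \mathbb{R}$ and let $f : A \to \mathbb{R}$. Suppose that $(a, \infty) \subseteq A$ for some 
$a \in A$. We say that f tends to $\infty$ [respectively, $-\infty$] as $x \to \infty$, and write 

\[ \lim_{x \to \infty} f = \infty \quad \left[\text{respectively, } \lim_{x \to \infty} f = -\infty\right] \]

if given any $\alpha \in \mathbb{R}$ there exists $K = K(\alpha) > a$ such that for any $x > K$, then $f(x) > \alpha$ 
[respectively, $f(x) < \alpha$]. 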


As before there is a sequential criterion for this limit. 


4.3.14 Theorem Let $A \subseteq \mathbb{R}$, let $f : A \to \mathbb{R}$, and suppose that $(a, \infty) \subseteq A$ for some 
$a \in \mathbb{R}$. Then the following statements are equivalent: 

(i) $\lim_{x \to \infty} f = \infty$ [respectively, $\lim_{x \to \infty} f = -\infty$]. 

(ii) For every sequence $(x_n)$ in $(a, \infty)$ such that $\lim(x_n) = \infty$, then $\lim(f(x_n)) = \infty$ 
[respectively, $\lim(f(x_n)) = -\infty$]. 


The next result is an analogue of Theorem 3.6.5. 


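4.3.15 Theorem Let $A \subseteq \mathbb{R}$, let $f, g : A \to \mathbb{R}$, and suppose that $(a, \infty) \subseteq A$ for some 
$a \in \mathbb{R}$. Suppose further that $g(x) > 0$ for all $x > a$ and that for some $L \in \mathbb{R}$, $L \neq 0$, we have 

\[ \lim_{x \to \infty} \frac{f(x)}{g(x)} = L. \]

(i) If $L > 0$, then $\lim_{x \to \infty} f = \infty$ if and only if $\lim_{x \to \infty} g = \infty$. 
(ii) If $L < 0$, then $\lim_{x \to \infty} f = -\infty$ if and only if $\lim_{x \to \infty} g = \infty$. 

Proof. (i) Since $L > 0$, the hypothesis implies that there exists $a_1 > a$ such that 

\[ 0 < \tfrac{1}{2} L < \frac{f(x)}{g(x)} < \tfrac{3}{2} L \quad\text{for}\quad x > a_1. \]

Therefore we have $\left(\tfrac{1}{2} L\right) g(x) < f(x) < \left(\tfrac{3}{2} L\right) g(x)$ for all $x > a_1$, from which the conclusion 
follows readily. 
The proof of (ii) is similar. Q.E.D. 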


We leave it to the reader to formulate the analogous result as $x \to -\infty$. 


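4.3.16 Examples (a) $\lim_{x \to \infty} x^n = \infty$ for $n \in \mathbb{N}$. 

Let $g(x) := x^n$ for $x \in (0, \infty)$. Given $\alpha \in \mathbb{R}$, let $K := \sup\{1, \alpha\}$. Then for all $x > K$, 
we have $g(x) = x^n \ge x > \alpha$. Since $\alpha \in \mathbb{R}$ is arbitrary, it follows that $\lim_{x \to \infty} g = \infty$. 

(b) $\lim_{x \to -\infty} x^n = \infty$ for $n \in \mathbb{N}$, n even, and $\lim_{x \to -\infty} x^n = -\infty$ for $n \in \mathbb{N}$, n odd. 

We will treat the case n odd, say $n = 2k + 1$ with $k = 0, 1, \ldots$. Given $\alpha \in \mathbb{R}$, let 
$K := \inf\{\alpha, -1\}$. For any $x < K$, then since $(x^2)^k \ge 1$, we have $x^n = (x^2)^k x \le x < \alpha$. 
Since $\alpha \in \mathbb{R}$ is arbitrary, it follows that $\lim_{x \to -\infty} x^n = -\infty$. 

(c) Let $p : \mathbb{R} \to \mathbb{R}$ be the polynomial function 

\[ p(x) := a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0. \]

Then $\lim_{x \to \infty} p = \infty$ if $a_n > 0$, and $\lim_{x \to \infty} p = -\infty$ if $a_n < 0$. 
Indeed, let $g(x) := x^n$ and apply Theorem 4.3.15. Since 

\[ \frac{p(x)}{g(x)} = a_n + a_{n-1}\left(\frac{1}{x}\right) + \cdots + a_1\left(\frac{1}{x^{n-1}}\right) + a_0\left(\frac{1}{x^n}\right), \]

it follows that $\lim_{x \to \infty} (p(x)/g(x)) = a_n$. Since $\lim_{x \to \infty} g = \infty$, the assertion follows from 
Theorem 4.3.15. 

(d) Let p be the polynomial function in part (c). Then $\lim_{x \to -\infty} p = \infty$ [respectively, $-\infty$] if 
n is even [respectively, odd] and $a_n > 0$. 
We leave the details to the reader. 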
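
The argument in Example 4.3.16(c) rests on the ratio $p(x)/x^n$ approaching the leading coefficient $a_n$. A small numerical sketch of this follows; the particular cubic and the helper names `poly` and `ratio` are illustrative assumptions, not taken from the text.

```python
def poly(coeffs, x):
    """Evaluate a_n x^n + ... + a_1 x + a_0 by Horner's rule.

    coeffs lists the coefficients from a_n down to a_0.
    """
    value = 0.0
    for a in coeffs:
        value = value * x + a
    return value

def ratio(coeffs, x):
    """p(x) / g(x) with g(x) = x^n, the quotient used with Theorem 4.3.15."""
    n = len(coeffs) - 1
    return poly(coeffs, x) / x**n

# Hypothetical example: p(x) = -2x^3 + 5x^2 + 7, so n = 3 and a_n = -2 < 0.
coeffs = [-2.0, 5.0, 0.0, 7.0]
ratios = [ratio(coeffs, 10.0**k) for k in range(1, 7)]

print(ratios)                 # values approach a_n = -2
print(poly(coeffs, 1e6))      # large and negative, as (c) predicts for a_n < 0
print(poly(coeffs, -1e6))     # large and positive: n odd with a_n < 0
```

Since $a_n < 0$ and n is odd here, the sign pattern at $-\infty$ is the mirror image of the case stated in part (d), which assumes $a_n > 0$.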




Exercises for Section 4.3 


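1. Prove Theorem 4.3.2. 

2. Give an example of a function that has a right-hand limit but not a left-hand limit at a point. 

3. Let $f(x) := |x|^{-1/2}$ for $x \neq 0$. Show that $\lim_{x \to 0+} f(x) = \lim_{x \to 0-} f(x) = +\infty$. 

4. Let $c \in \mathbb{R}$ and let f be defined for $x \in (c, \infty)$ and $f(x) > 0$ for all $x \in (c, \infty)$. Show that 
$\lim_{x \to c} f = \infty$ if and only if $\lim_{x \to c} 1/f = 0$. 

5. Evaluate the following limits, or show that they do not exist. 

(a) $\lim_{x \to 1+} \dfrac{x}{x - 1}$ $(x \neq 1)$, (b) $\lim_{x \to 1} \dfrac{x}{x - 1}$ $(x \neq 1)$, 

(c) $\lim_{x \to 0+} (x + 2)/\sqrt{x}$ $(x > 0)$, (d) $\lim_{x \to \infty} (x + 2)/\sqrt{x}$ $(x > 0)$, 

(e) $\lim_{x \to 0} (\sqrt{x + 1})/x$ $(x > -1)$, (f) $\lim_{x \to \infty} (\sqrt{x + 1})/x$ $(x > 0)$, 

(g) $\lim_{x \to \infty} \dfrac{\sqrt{x} - 5}{\sqrt{x} + 3}$ $(x > 0)$, (h) $\lim_{x \to \infty} \dfrac{\sqrt{x} - x}{\sqrt{x} + x}$ $(x > 0)$. 

6. Prove Theorem 4.3.11. 

7. Suppose that f and g have limits in $\mathbb{R}$ as $x \to \infty$ and that $f(x) \le g(x)$ for all $x \in (a, \infty)$. Prove 
that $\lim_{x \to \infty} f \le \lim_{x \to \infty} g$. 

8. Let f be defined on $(0, \infty)$ to $\mathbb{R}$. Prove that $\lim_{x \to \infty} f(x) = L$ if and only if $\lim_{x \to 0+} f(1/x) = L$. 

9. Show that if $f : (a, \infty) \to \mathbb{R}$ is such that $\lim_{x \to \infty} x f(x) = L$ where $L \in \mathbb{R}$, then $\lim_{x \to \infty} f(x) = 0$. 

10. Prove Theorem 4.3.14. 

11. Suppose that $\lim_{x \to c} f(x) = L$ where $L > 0$, and that $\lim_{x \to c} g(x) = \infty$. Show that $\lim_{x \to c} f(x)g(x) = \infty$. 
If $L = 0$, show by example that this conclusion may fail. 

12. Find functions f and g defined on $(0, \infty)$ such that $\lim_{x \to \infty} f = \infty$ and $\lim_{x \to \infty} g = \infty$, and 
$\lim_{x \to \infty} (f - g) = 0$. Can you find such functions, with $g(x) > 0$ for all $x \in (0, \infty)$, such that 
$\lim_{x \to \infty} f/g = 0$? 

13. Let f and g be defined on $(a, \infty)$ and suppose $\lim_{x \to \infty} f = L$ and $\lim_{x \to \infty} g = \infty$. Prove that 
$\lim_{x \to \infty} f \circ g = L$. 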


CHAPTER 5 


CONTINUOUS FUNCTIONS 


We now begin the study of the most important class of functions that arises in real analysis: 
the class of continuous functions. The term “continuous” has been used since the time of 
Newton to refer to the motion of bodies or to describe an unbroken curve, but it was not made 
precise until the nineteenth century. Work of Bernhard Bolzano in 1817 and Augustin-Louis 
Cauchy in 1821 identified continuity as a very significant property of functions and proposed 
definitions, but since the concept is tied to that of limit, it was the careful work of Karl 
Weierstrass in the 1870s that brought proper understanding to the idea of continuity. 

We will first define the notions of continuity at a point and continuity on a set, and then 
show that various combinations of continuous functions give rise to continuous functions. 
Then in Section 5.3 we establish the fundamental properties that make continuous functions 
so important. For instance, we will prove that a continuous function on a closed bounded 
interval must attain a maximum and a minimum value. We also prove that a continuous 
function must take on every value intermediate to any two values it attains. These properties 
and others are not possessed by general functions, as various examples illustrate, and thus 
they distinguish continuous functions as a very special class of functions. 

In Section 5.4 we introduce the very important notion of uniform continuity. The 
distinction between continuity and uniform continuity is somewhat subtle and was not fully 
appreciated until the work of Weierstrass and the mathematicians of his era, but it proved to 
be very significant in applications. We present one application to the idea of approximating 
continuous functions by more elementary functions (such as polynomials). 


Karl Weierstrass 

Karl Weierstrass (= Weierstraß) (1815-1897) was born in Westphalia, 
Germany. His father, a customs officer in a salt works, insisted that he study 
law and public finance at the University of Bonn, but he had more interest in 
drinking and fencing, and left Bonn without receiving a diploma. He then 
enrolled in the Academy of Münster where he studied mathematics with 
Christoph Gudermann. From 1841 to 1854 he taught at various gymnasia in 
Prussia. Despite the fact that he had no contact with the mathematical world 
during this time, he worked hard on mathematical research and was able to 
publish a few papers, one of which attracted considerable attention. Indeed, 
the University of Königsberg gave him an honorary doctoral degree for this work in 1855. The next 
year, he secured positions at the Industrial Institute of Berlin and the University of Berlin. He 
remained at Berlin until his death. 

A methodical and painstaking scholar, Weierstrass distrusted intuition and worked to put 
everything on a firm and logical foundation. He did fundamental work on the foundations of 
arithmetic and analysis, on complex analysis, the calculus of variations, and algebraic geometry. 
Due to his meticulous preparation, he was an extremely popular lecturer; it was not unusual for 
him to speak about advanced mathematical topics to audiences of more than 250. Among his 
auditors are counted Georg Cantor, Sonya Kovalevsky, Gösta Mittag-Leffler, Max Planck, Otto 
Hölder, David Hilbert, and Oskar Bolza (who had many American doctoral students). Through his 
writings and his lectures, Weierstrass had a profound influence on contemporary mathematics. 

The notion of a “gauge” is introduced in Section 5.5 and is used to provide an alternative 
method of proving the fundamental properties of continuous functions. The main significance 
of this concept, however, is in the area of integration theory where gauges are essential 
in defining the generalized Riemann integral. This will be discussed in Chapter 10. 

Monotone functions are an important class of functions with strong continuity 
properties and they are discussed in Section 5.6. 


Section 5.1 Continuous Functions 


In this section, which is very similar to Section 4.1, we will define what it means to say that 
a function is continuous at a point, or on a set. This notion of continuity is one of the central 
concepts of mathematical analysis, and it will be used in almost all of the following 
material in this book. Consequently, it is essential that the reader master it. 


5.1.1 Definition Let $A \subseteq \mathbb{R}$, let $f : A \to \mathbb{R}$, and let $c \in A$. We say that f is continuous at 
c if, given any number $\varepsilon > 0$, there exists $\delta > 0$ such that if x is any point of A satisfying 
$|x - c| < \delta$, then $|f(x) - f(c)| < \varepsilon$. 

If f fails to be continuous at c, then we say that f is discontinuous at c. 


As with the definition of limit, the definition of continuity at a point can be formulated 
very nicely in terms of neighborhoods. This is done in the next result. We leave the 
verification as an important exercise for the reader. See Figure 5.1.1. 


Figure 5.1.1 Given $V_\varepsilon(f(c))$, a neighborhood $V_\delta(c)$ is to be determined 


5.1.2 Theorem A function $f : A \to \mathbb{R}$ is continuous at a point $c \in A$ if and only if given 
any $\varepsilon$-neighborhood $V_\varepsilon(f(c))$ of $f(c)$ there exists a $\delta$-neighborhood $V_\delta(c)$ of c such that if 
x is any point of $A \cap V_\delta(c)$, then $f(x)$ belongs to $V_\varepsilon(f(c))$, that is, 

\[ f(A \cap V_\delta(c)) \subseteq V_\varepsilon(f(c)). \]


Remarks (1) If $c \in A$ is a cluster point of A, then a comparison of Definitions 4.1.4 
and 5.1.1 shows that f is continuous at c if and only if 

\[ f(c) = \lim_{x \to c} f(x). \tag{1} \]

Thus, if c is a cluster point of A, then three conditions must hold for f to be continuous at c: 

(i) f must be defined at c (so that $f(c)$ makes sense), 

(ii) the limit of f at c must exist in $\mathbb{R}$ (so that $\lim_{x \to c} f(x)$ makes sense), and 

(iii) these two values must be equal. 
(2) If $c \in A$ is not a cluster point of A, then there exists a neighborhood $V_\delta(c)$ of c such 
that $A \cap V_\delta(c) = \{c\}$. Thus we conclude that a function f is automatically continuous at a 
point $c \in A$ that is not a cluster point of A. Such points are often called “isolated points” of 
A. They are of little practical interest to us, since they have no relation to a limiting process. 
Since continuity is automatic for such points, we generally test for continuity only at cluster 
points. Thus we regard condition (1) as being characteristic for continuity at c. 


A slight modification of the proof of Theorem 4.1.8 for limits yields the following 
sequential version of continuity at a point. 


5.1.3 Sequential Criterion for Continuity A function $f : A \to \mathbb{R}$ is continuous at the 
point $c \in A$ if and only if for every sequence $(x_n)$ in A that converges to c, the sequence 
$(f(x_n))$ converges to $f(c)$. 


The following Discontinuity Criterion is a consequence of the last theorem. It should 
be compared with the Divergence Criterion 4.1.9(a) with L = f(c). Its proof should be 
written out in detail by the reader. 


5.1.4 Discontinuity Criterion Let $A \subseteq \mathbb{R}$, let $f : A \to \mathbb{R}$, and let $c \in A$. Then f is 
discontinuous at c if and only if there exists a sequence $(x_n)$ in A such that $(x_n)$ converges 
to c, but the sequence $(f(x_n))$ does not converge to $f(c)$. 


So far we have discussed continuity at a point. To talk about the continuity of a 
function on a set, we will simply require that the function be continuous at each point of the 
set. We state this formally in the next definition. 


5.1.5 Definition Let $A \subseteq \mathbb{R}$ and let $f : A \to \mathbb{R}$. If B is a subset of A, we say that f is 
continuous on the set B if f is continuous at every point of B. 


5.1.6 Examples (a) The constant function $f(x) := b$ is continuous on $\mathbb{R}$. 

It was seen in Example 4.1.7(a) that if $c \in \mathbb{R}$, then $\lim_{x \to c} f(x) = b$. Since $f(c) = b$, we 
have $\lim_{x \to c} f(x) = f(c)$, and thus f is continuous at every point $c \in \mathbb{R}$. Therefore f is 
continuous on $\mathbb{R}$. 

(b) $g(x) := x$ is continuous on $\mathbb{R}$. 

It was seen in Example 4.1.7(b) that if $c \in \mathbb{R}$, then we have $\lim_{x \to c} g = c$. Since $g(c) = c$, 
then g is continuous at every point $c \in \mathbb{R}$. Thus g is continuous on $\mathbb{R}$. 

(c) $h(x) := x^2$ is continuous on $\mathbb{R}$. 

It was seen in Example 4.1.7(c) that if $c \in \mathbb{R}$, then we have $\lim_{x \to c} h = c^2$. Since 
$h(c) = c^2$, then h is continuous at every point $c \in \mathbb{R}$. Thus h is continuous on $\mathbb{R}$. 

(d) $g(x) := 1/x$ is continuous on $A := \{x \in \mathbb{R} : x > 0\}$. 
It was seen in Example 4.1.7(d) that if $c \in A$, then we have $\lim_{x \to c} g = 1/c$. Since 
$g(c) = 1/c$, this shows that g is continuous at every point $c \in A$. Thus g is continuous on A. 

(e) $g(x) := 1/x$ is not continuous at $x = 0$. 

Indeed, if $g(x) = 1/x$ for $x > 0$, then g is not defined for $x = 0$, so it cannot be 
continuous there. Alternatively, it was seen in Example 4.1.10(a) that $\lim_{x \to 0} g$ does not exist 
in $\mathbb{R}$, so g cannot be continuous at $x = 0$. 
(f) The signum function sgn is not continuous at 0. 

The signum function was defined in Example 4.1.10(b), where it was also shown that 
$\lim_{x \to 0} \operatorname{sgn}(x)$ does not exist in $\mathbb{R}$. Therefore sgn is not continuous at $x = 0$ (even though sgn 0 
is defined). It is an exercise to show that sgn is continuous at every point $c \neq 0$. 


Note In the next two examples, we introduce functions that played a significant role in 
the development of real analysis. Discontinuities are emphasized and it is not possible to 
graph either of them satisfactorily. The intuitive idea of drawing a curve in the plane to 
represent a function simply does not apply, and plotting a handful of points gives only a hint 
of their character. In the nineteenth century, these functions clearly demonstrated the need 
for a precise and rigorous treatment of the basic concepts of analysis. They will reappear in 
later sections. 


(g) Let $A := \mathbb{R}$ and let f be Dirichlet’s “discontinuous function” defined by 

\[ f(x) := \begin{cases} 1 & \text{if } x \text{ is rational,} \\ 0 & \text{if } x \text{ is irrational.} \end{cases} \]

We claim that f is not continuous at any point of $\mathbb{R}$. (This function was introduced in 1829 
by P. G. L. Dirichlet.) 

Indeed, if c is a rational number, let $(x_n)$ be a sequence of irrational numbers that 
converges to c. (Corollary 2.4.9 to the Density Theorem 2.4.8 assures us that such a 
sequence does exist.) Since $f(x_n) = 0$ for all $n \in \mathbb{N}$, we have $\lim(f(x_n)) = 0$, while 
$f(c) = 1$. Therefore f is not continuous at the rational number c. 

On the other hand, if b is an irrational number, let $(y_n)$ be a sequence of rational 
numbers that converge to b. (The Density Theorem 2.4.8 assures us that such a sequence 
does exist.) Since $f(y_n) = 1$ for all $n \in \mathbb{N}$, we have $\lim(f(y_n)) = 1$, while $f(b) = 0$. 
Therefore f is not continuous at the irrational number b. 

Since every real number is either rational or irrational, we deduce that f is not 
continuous at any point in $\mathbb{R}$. 

Figure 5.1.2 Thomae’s function 


(h) Let $A := \{x \in \mathbb{R} : x > 0\}$. For any irrational number $x > 0$ we define $h(x) := 0$. 
For a rational number in A of the form $m/n$, with natural numbers m, n having no 
common factors except 1, we define $h(m/n) := 1/n$. (We also define $h(0) := 1$.) 

We claim that h is continuous at every irrational number in A, and is discontinuous at 
every rational number in A. (This function was introduced in 1875 by K. J. Thomae.) 

Indeed, if $a > 0$ is rational, let $(x_n)$ be a sequence of irrational numbers in 
A that converges to a. Then $\lim(h(x_n)) = 0$, while $h(a) > 0$. Hence h is discontinuous 
at a. 

On the other hand, if b is an irrational number and $\varepsilon > 0$, then (by the Archimedean 
Property) there is a natural number $n_0$ such that $1/n_0 < \varepsilon$. There are only a finite number 
of rationals with denominator less than $n_0$ in the interval $(b - 1, b + 1)$. (Why?) Hence 
$\delta > 0$ can be chosen so small that the neighborhood $(b - \delta, b + \delta)$ contains no rational 
numbers with denominator less than $n_0$. It then follows that for $|x - b| < \delta$, $x \in A$, 
we have $|h(x) - h(b)| = |h(x)| \le 1/n_0 < \varepsilon$. Thus h is continuous at the irrational 
number b. 

Consequently, we deduce that Thomae’s function h is continuous precisely at the 
irrational points in A. (See Figure 5.1.2.) 
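
The choice of $\delta$ in the argument for Thomae’s function can be made concrete. The sketch below is illustrative only: the helper names `thomae` and `delta_for` are assumptions, the floating-point value of $\sqrt{2}$ stands in for an irrational b, and only rational inputs are handled, since irrationality cannot be detected in floating-point arithmetic.

```python
from fractions import Fraction
import math

def thomae(x):
    """Thomae's function on a rational input: h(m/n) = 1/n in lowest terms, h(0) = 1.

    Fraction reduces m/n automatically, so thomae(Fraction(3, 6)) is 1/2.
    """
    q = Fraction(x)
    if q == 0:
        return Fraction(1)
    return Fraction(1, q.denominator)

def delta_for(b, n0):
    """Distance from b to the nearest rational m/n with 1 <= n < n0.

    Only the two multiples of 1/n adjacent to b need to be examined for each n.
    """
    best = 1.0
    for n in range(1, n0):
        m = math.floor(b * n)
        for cand in (m, m + 1):
            best = min(best, abs(b - cand / n))
    return best

b = math.sqrt(2.0)   # stand-in for an irrational point
n0 = 11              # 1/n0 < eps for eps = 0.1
d = delta_for(b, n0)

# Every rational m/n with |m/n - b| < d has reduced denominator >= n0,
# so h(m/n) <= 1/n0; check this for all denominators up to 200.
worst = max(
    (thomae(Fraction(m, n))
     for n in range(1, 201)
     for m in range(n, 2 * n + 1)
     if abs(b - m / n) < d),
    default=Fraction(0),
)
print(d, worst)
```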


5.1.7 Remarks (a) Sometimes a function $f : A \to \mathbb{R}$ is not continuous at a point c 
because it is not defined at this point. However, if the function f has a limit L at the point c 
and if we define F on $A \cup \{c\} \to \mathbb{R}$ by 

\[ F(x) := \begin{cases} L & \text{for } x = c, \\ f(x) & \text{for } x \in A, \end{cases} \]

then F is continuous at c. To see this, one needs to check that $\lim_{x \to c} F = L$, but this follows 
(why?), since $\lim_{x \to c} f = L$. 

(b) If a function $g : A \to \mathbb{R}$ does not have a limit at c, then there is no way that we can 
obtain a function $G : A \cup \{c\} \to \mathbb{R}$ that is continuous at c by defining 

\[ G(x) := \begin{cases} C & \text{for } x = c, \\ g(x) & \text{for } x \in A. \end{cases} \]

To see this, observe that if $\lim_{x \to c} G$ exists and equals C, then $\lim_{x \to c} g$ must also exist and 
equal C. 


5.1.8 Examples (a) The function $g(x) := \sin(1/x)$ for $x \neq 0$ (see Figure 4.1.3) does 
not have a limit at $x = 0$ (see Example 4.1.10(c)). Thus there is no value that we can assign 
at $x = 0$ to obtain a continuous extension of g at $x = 0$. 

(b) Let $f(x) := x \sin(1/x)$ for $x \neq 0$. (See Figure 5.1.3.) It was seen in Example 4.2.8(f) 
that $\lim_{x \to 0} (x \sin(1/x)) = 0$. Therefore it follows from Remark 5.1.7(a) that if we define 
$F : \mathbb{R} \to \mathbb{R}$ by 

\[ F(x) := \begin{cases} 0 & \text{for } x = 0, \\ x \sin(1/x) & \text{for } x \neq 0, \end{cases} \]

then F is continuous at $x = 0$. 
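
The contrast between the two examples can be seen numerically. Along two different sequences tending to 0, $\sin(1/x)$ takes values near 0 and near 1, whereas $|F(x)| \le |x|$ forces $F(x) \to 0 = F(0)$. The sequences chosen below are one convenient assumption among many, and the computation is a sketch rather than a proof.

```python
import math

def g(x):
    """g(x) = sin(1/x) for x != 0 (Example 5.1.8(a))."""
    return math.sin(1.0 / x)

def F(x):
    """The extension of x*sin(1/x) by F(0) = 0 (Example 5.1.8(b))."""
    return 0.0 if x == 0 else x * math.sin(1.0 / x)

ns = range(1, 101)
xs = [1.0 / (n * math.pi) for n in ns]                # sin(1/x_n) = sin(n*pi), about 0
ys = [1.0 / ((2 * n + 0.5) * math.pi) for n in ns]    # sin(1/y_n) is about 1

g_on_xs = [g(x) for x in xs]
g_on_ys = [g(y) for y in ys]

# |F(x) - F(0)| <= |x|, so delta = eps works in Definition 5.1.1 at c = 0.
bound_ok = all(abs(F(t) - F(0.0)) <= abs(t) for t in xs + ys)

print(max(abs(v) for v in g_on_xs), min(g_on_ys), bound_ok)
```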

Figure 5.1.3 Graph of $f(x) = x \sin(1/x)$ $(x \neq 0)$ 


Exercises for Section 5.1 


1. Prove the Sequential Criterion 5.1.3. 

2. Establish the Discontinuity Criterion 5.1.4. 

3. Let $a < b < c$. Suppose that f is continuous on $[a, b]$, that g is continuous on $[b, c]$, and that 
$f(b) = g(b)$. Define h on $[a, c]$ by $h(x) := f(x)$ for $x \in [a, b]$ and $h(x) := g(x)$ for $x \in [b, c]$. 
Prove that h is continuous on $[a, c]$. 

4. If $x \in \mathbb{R}$, we define $[\![x]\!]$ to be the greatest integer $n \in \mathbb{Z}$ such that $n \le x$. (Thus, for example, 
$[\![8.3]\!] = 8$, $[\![\pi]\!] = 3$, $[\![-\pi]\!] = -4$.) The function $x \mapsto [\![x]\!]$ is called the greatest integer function. 
Determine the points of continuity of the following functions: 

(a) $f(x) := [\![x]\!]$, (b) $g(x) := x[\![x]\!]$, 

(c) $h(x) := [\![\sin x]\!]$, (d) $k(x) := [\![1/x]\!]$ $(x \neq 0)$. 

5. Let f be defined for all $x \in \mathbb{R}$, $x \neq 2$, by $f(x) = (x^2 + x - 6)/(x - 2)$. Can f be defined at 
$x = 2$ in such a way that f is continuous at this point? 

6. Let $A \subseteq \mathbb{R}$ and let $f : A \to \mathbb{R}$ be continuous at a point $c \in A$. Show that for any $\varepsilon > 0$, there exists 
a neighborhood $V_\delta(c)$ of c such that if $x, y \in A \cap V_\delta(c)$, then $|f(x) - f(y)| < \varepsilon$. 

7. Let $f : \mathbb{R} \to \mathbb{R}$ be continuous at c and let $f(c) > 0$. Show that there exists a neighborhood $V_\delta(c)$ 
of c such that if $x \in V_\delta(c)$, then $f(x) > 0$. 

8. Let $f : \mathbb{R} \to \mathbb{R}$ be continuous on $\mathbb{R}$ and let $S := \{x \in \mathbb{R} : f(x) = 0\}$ be the “zero set” of f. If 
$(x_n)$ is in S and $x = \lim(x_n)$, show that $x \in S$. 

9. Let $A \subseteq B \subseteq \mathbb{R}$, let $f : B \to \mathbb{R}$ and let g be the restriction of f to A (that is, $g(x) = f(x)$ for 
$x \in A$). 

(a) If f is continuous at $c \in A$, show that g is continuous at c. 

(b) Show by example that if g is continuous at c, it need not follow that f is continuous 
at c. 

10. Show that the absolute value function $f(x) := |x|$ is continuous at every point $c \in \mathbb{R}$. 

11. Let $K > 0$ and let $f : \mathbb{R} \to \mathbb{R}$ satisfy the condition $|f(x) - f(y)| \le K|x - y|$ for all $x, y \in \mathbb{R}$. 
Show that f is continuous at every point $c \in \mathbb{R}$. 

12. Suppose that $f : \mathbb{R} \to \mathbb{R}$ is continuous on $\mathbb{R}$ and that $f(r) = 0$ for every rational number r. Prove 
that $f(x) = 0$ for all $x \in \mathbb{R}$. 

13. Define $g : \mathbb{R} \to \mathbb{R}$ by $g(x) := 2x$ for x rational, and $g(x) := x + 3$ for x irrational. Find all 
points at which g is continuous. 

14. Let $A := (0, \infty)$ and let $k : A \to \mathbb{R}$ be defined as follows. For $x \in A$, x irrational, we define 
$k(x) = 0$; for $x \in A$ rational and of the form $x = m/n$ with natural numbers m, n having no 
common factors except 1, we define $k(x) := n$. Prove that k is unbounded on every open interval 
in A. Conclude that k is not continuous at any point of A. (See Example 5.1.6(h).) 

15. Let $f : (0, 1) \to \mathbb{R}$ be bounded but such that $\lim_{x \to 0} f$ does not exist. Show that there are two 
sequences $(x_n)$ and $(y_n)$ in $(0, 1)$ with $\lim(x_n) = 0 = \lim(y_n)$, but such that $\lim(f(x_n))$ and $\lim(f(y_n))$ 
exist but are not equal. 


Section 5.2 Combinations of Continuous Functions 


Let $A \subseteq \mathbb{R}$ and let f and g be functions that are defined on A to $\mathbb{R}$ and let $b \in \mathbb{R}$. In Definition 
4.2.3 we defined the sum, difference, product, and multiple functions denoted by 
$f + g$, $f - g$, $fg$, $bf$. In addition, if $h : A \to \mathbb{R}$ is such that $h(x) \neq 0$ for all $x \in A$, then 
we defined the quotient function denoted by $f/h$. 

The next result is similar to Theorem 4.2.4, from which it follows. 


5.2.1 Theorem Let $A \subseteq \mathbb{R}$, let f and g be functions on A to $\mathbb{R}$, and let $b \in \mathbb{R}$. Suppose 
that $c \in A$ and that f and g are continuous at c. 
(a) Then $f + g$, $f - g$, $fg$, and $bf$ are continuous at c. 

(b) If $h : A \to \mathbb{R}$ is continuous at $c \in A$ and if $h(x) \neq 0$ for all $x \in A$, then the quotient 
$f/h$ is continuous at c. 


Proof. If $c \in A$ is not a cluster point of A, then the conclusion is automatic. Hence we 
assume that c is a cluster point of A. 
(a) Since f and g are continuous at c, then 

\[ f(c) = \lim_{x \to c} f \quad\text{and}\quad g(c) = \lim_{x \to c} g. \]

Hence it follows from Theorem 4.2.4(a) that 

\[ (f + g)(c) = f(c) + g(c) = \lim_{x \to c} (f + g). \]

Therefore $f + g$ is continuous at c. The remaining assertions in part (a) are proved in a 
similar fashion. 

(b) Since $c \in A$, then $h(c) \neq 0$. But since $h(c) = \lim_{x \to c} h$, it follows from Theorem 4.2.4(b) 
that 

\[ \frac{f}{h}(c) = \frac{f(c)}{h(c)} = \frac{\lim_{x \to c} f}{\lim_{x \to c} h} = \lim_{x \to c} \left( \frac{f}{h} \right). \]

Therefore $f/h$ is continuous at c. Q.E.D. 


The next result is an immediate consequence of Theorem 5.2.1, applied to every point 
of A. However, since it is an extremely important result, we shall state it formally. 


5.2.2 Theorem Let $A \subseteq \mathbb{R}$, let f and g be continuous on A to $\mathbb{R}$, and let $b \in \mathbb{R}$. 

(a) The functions $f + g$, $f - g$, $fg$, and $bf$ are continuous on A. 

(b) If $h : A \to \mathbb{R}$ is continuous on A and $h(x) \neq 0$ for $x \in A$, then the quotient $f/h$ is 
continuous on A. 

Remark To define quotients, it is sometimes more convenient to proceed as follows. If 
$\varphi : A \to \mathbb{R}$, let $A_1 := \{x \in A : \varphi(x) \neq 0\}$. We can define the quotient $f/\varphi$ on the set $A_1$ by 

\[ \left( \frac{f}{\varphi} \right)(x) := \frac{f(x)}{\varphi(x)} \quad\text{for}\quad x \in A_1. \tag{1} \]

If $\varphi$ is continuous at a point $c \in A_1$, it is clear that the restriction $\varphi_1$ of $\varphi$ to $A_1$ is also 
continuous at c. Therefore it follows from Theorem 5.2.1(b) applied to $\varphi_1$ that $f/\varphi_1$ 
is continuous at $c \in A_1$. Since $(f/\varphi)(x) = (f/\varphi_1)(x)$ for $x \in A_1$, it follows that $f/\varphi$ is 
continuous at $c \in A_1$. Similarly, if f and $\varphi$ are continuous on A, then the function $f/\varphi$, 
defined on $A_1$ by (1), is continuous on $A_1$. 


5.2.3 Examples (a) Polynomial functions. 

If p is a polynomial function, so that $p(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0$ for all 
$x \in \mathbb{R}$, then it follows from Example 4.2.5(f) that $p(c) = \lim_{x \to c} p$ for any $c \in \mathbb{R}$. Thus a 
polynomial function is continuous on $\mathbb{R}$. 


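(b) Rational functions. 

If p and q are polynomial functions on $\mathbb{R}$, then there are at most a finite number 
$\alpha_1, \ldots, \alpha_m$ of real roots of q. If $x \notin \{\alpha_1, \ldots, \alpha_m\}$ then $q(x) \neq 0$ so that we can define the 
rational function r by 

\[ r(x) := \frac{p(x)}{q(x)} \quad\text{for}\quad x \notin \{\alpha_1, \ldots, \alpha_m\}. \]

If $q(c) \neq 0$, then part (a) and Theorem 4.2.4(b) give 

\[ r(c) = \frac{p(c)}{q(c)} = \frac{\lim_{x \to c} p}{\lim_{x \to c} q} = \lim_{x \to c} r. \]

In other words, r is continuous at c. Since c is any real number that is not a root of q, we 
infer that a rational function is continuous at every real number for which it is defined. 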


(c) We shall show that the sine function sin is continuous on $\mathbb{R}$. 
To do so we make use of the following properties of the sine and cosine functions. 
(See Section 8.4.) For all $x, y, z \in \mathbb{R}$ we have: 

\[ |\sin z| \le |z|, \qquad |\cos z| \le 1, \]
\[ \sin x - \sin y = 2 \sin\left[\tfrac{1}{2}(x - y)\right] \cos\left[\tfrac{1}{2}(x + y)\right]. \]

Hence if $c \in \mathbb{R}$, then we have 

\[ |\sin x - \sin c| \le 2 \cdot \tfrac{1}{2}|x - c| \cdot 1 = |x - c|. \]

Therefore sin is continuous at c. Since $c \in \mathbb{R}$ is arbitrary, it follows that sin is continuous 
on $\mathbb{R}$. 
(d) The cosine function is continuous on $\mathbb{R}$. 
We make use of the following properties of the sine and cosine functions. For all 
$x, y, z \in \mathbb{R}$ we have: 

\[ |\sin z| \le |z|, \qquad |\sin z| \le 1, \]
\[ \cos x - \cos y = -2 \sin\left[\tfrac{1}{2}(x + y)\right] \sin\left[\tfrac{1}{2}(x - y)\right]. \]

Hence if $c \in \mathbb{R}$, then we have 

\[ |\cos x - \cos c| \le 2 \cdot 1 \cdot \tfrac{1}{2}|c - x| = |x - c|. \]

Therefore cos is continuous at c. Since $c \in \mathbb{R}$ is arbitrary, it follows that cos is continuous 
on $\mathbb{R}$. (Alternatively, we could use the relation $\cos x = \sin(x + \pi/2)$.) 
(e) The functions tan, cot, sec, csc are continuous where they are defined. 

For example, the cotangent function is defined by 

\[ \cot x := \frac{\cos x}{\sin x} \]

provided $\sin x \neq 0$ (that is, provided $x \neq n\pi$, $n \in \mathbb{Z}$). Since sin and cos are continuous on 
$\mathbb{R}$, it follows (see the Remark before Example 5.2.3) that the function cot is continuous 
on its domain. The other trigonometric functions are treated similarly. 
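
The estimates $|\sin x - \sin c| \le |x - c|$ and $|\cos x - \cos c| \le |x - c|$ from parts (c) and (d) can be spot-checked on random samples. In the sketch below the sampling range, the seed, and the small tolerance for rounding error are assumptions chosen for illustration; a random test of this kind supports the inequality but does not prove it.

```python
import math
import random

def lipschitz_ok(func, trials=10000, seed=0, tol=1e-12):
    """Check |func(x) - func(c)| <= |x - c| (up to rounding) on random pairs."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = rng.uniform(-50.0, 50.0)
        c = rng.uniform(-50.0, 50.0)
        if abs(func(x) - func(c)) > abs(x - c) + tol:
            return False
    return True

def identity_gap(x, y):
    """Difference between the two sides of sin x - sin y = 2 sin((x-y)/2) cos((x+y)/2)."""
    lhs = math.sin(x) - math.sin(y)
    rhs = 2.0 * math.sin(0.5 * (x - y)) * math.cos(0.5 * (x + y))
    return abs(lhs - rhs)

print(lipschitz_ok(math.sin), lipschitz_ok(math.cos), identity_gap(1.3, -0.7))
```

Because the bound holds with constant 1, taking $\delta = \varepsilon$ satisfies Definition 5.1.1 at every point c.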


5.2.4 Theorem Let $A \subseteq \mathbb{R}$, let $f : A \to \mathbb{R}$, and let $|f|$ be defined by $|f|(x) := |f(x)|$ for 
$x \in A$. 

(a) If f is continuous at a point $c \in A$, then $|f|$ is continuous at c. 
(b) If f is continuous on A, then $|f|$ is continuous on A. 

Proof. This is an immediate consequence of Exercise 4.2.14. Q.E.D. 


5.2.5 Theorem Let A ⊆ R, let f : A → R, and let f(x) ≥ 0 for all x ∈ A. We let √f be
defined for x ∈ A by (√f)(x) := √(f(x)).


(a) If f is continuous at a point c ∈ A, then √f is continuous at c.
(b) If f is continuous on A, then √f is continuous on A.


Proof. This is an immediate consequence of Exercise 4.2.15. Q.E.D. 


Composition of Continuous Functions 


We now show that if the function f : A → R is continuous at a point c and if g : B → R is
continuous at b = f(c), then the composition g ∘ f is continuous at c. In order to assure that
g ∘ f is defined on all of A, we also need to assume that f(A) ⊆ B.


5.2.6 Theorem Let A, B ⊆ R and let f : A → R and g : B → R be functions such that
f(A) ⊆ B. If f is continuous at a point c ∈ A and g is continuous at b = f(c) ∈ B, then the
composition g ∘ f : A → R is continuous at c.


Proof. Let W be an ε-neighborhood of g(b). Since g is continuous at b, there is a
δ-neighborhood V of b = f(c) such that if y ∈ B ∩ V then g(y) ∈ W. Since f is continuous
at c, there is a γ-neighborhood U of c such that if x ∈ A ∩ U, then f(x) ∈ V. (See
Figure 5.2.1.) Since f(A) ⊆ B, it follows that if x ∈ A ∩ U, then f(x) ∈ B ∩ V so that
g ∘ f(x) = g(f(x)) ∈ W. But since W is an arbitrary ε-neighborhood of g(b), this implies
that g ∘ f is continuous at c. Q.E.D.


5.2.7 Theorem Let A, B ⊆ R, let f : A → R be continuous on A, and let g : B → R be
continuous on B. If f(A) ⊆ B, then the composite function g ∘ f : A → R is continuous on A.


Proof. The theorem follows immediately from the preceding result, if f and g are 
continuous at every point of A and B, respectively. Q.E.D. 


Figure 5.2.1 The composition of f and g


Theorems 5.2.6 and 5.2.7 are very useful in establishing that certain functions are 
continuous. They can be used in many situations where it would be difficult to apply the 
definition of continuity directly. 


5.2.8 Examples (a) Let g_1(x) := |x| for x ∈ R. It follows from the Triangle Inequality
that


        |g_1(x) − g_1(c)| ≤ |x − c|


for all x, c ∈ R. Hence g_1 is continuous at c ∈ R. If f : A → R is any function that is
continuous on A, then Theorem 5.2.7 implies that g_1 ∘ f = |f| is continuous on A. This
gives another proof of Theorem 5.2.4.

(b) Let g_2(x) := √x for x ≥ 0. It follows from Theorems 3.2.10 and 5.1.3 that g_2 is
continuous at any number c ≥ 0. If f : A → R is continuous on A and if f(x) ≥ 0 for all
x ∈ A, then it follows from Theorem 5.2.7 that g_2 ∘ f = √f is continuous on A. This gives
another proof of Theorem 5.2.5.

(c) Let g_3(x) := sin x for x ∈ R. We have seen in Example 5.2.3(c) that g_3 is continuous
on R. If f : A → R is continuous on A, then it follows from Theorem 5.2.7 that g_3 ∘ f is
continuous on A.

In particular, if f(x) := 1/x for x ≠ 0, then the function g(x) := sin(1/x) is continuous
at every point c ≠ 0. [We have seen, in Example 5.1.8(a), that g cannot be defined at
0 in order to become continuous at that point.]


Exercises for Section 5.2 


1. Determine the points of continuity of the following functions and state which theorems are used
in each case.


(a) f(x) := (x^2 + 2x + 1)/(x^2 + 1)  (x ∈ R),    (b) g(x) := √(x + √x)  (x ≥ 0),
(c) h(x) := √(1 + |sin x|)/x  (x ≠ 0),    (d) k(x) := cos √(1 + x^2)  (x ∈ R).


2. Show that if f : A → R is continuous on A ⊆ R and if n ∈ N, then the function f^n defined by
f^n(x) = (f(x))^n, for x ∈ A, is continuous on A.


3. Give an example of functions f and g that are both discontinuous at a point c in R such that (a) the
sum f + g is continuous at c, (b) the product fg is continuous at c.


4. Let x ↦ [x] denote the greatest integer function (see Exercise 5.1.4). Determine the points of
continuity of the function f(x) := x − [x], x ∈ R.


5. Let g be defined on R by g(1) := 0, and g(x) := 2 if x ≠ 1, and let f(x) := x + 1 for all x ∈ R.
Show that lim_{x→0} g ∘ f ≠ (g ∘ f)(0). Why doesn’t this contradict Theorem 5.2.6?


6. Let f, g be defined on R and let c ∈ R. Suppose that lim_{x→c} f = b and that g is continuous at b.
Show that lim_{x→c} g ∘ f = g(b). (Compare this result with Theorem 5.2.7 and the preceding
exercise.)


7. Give an example of a function f : [0, 1] → R that is discontinuous at every point of [0, 1] but
such that |f| is continuous on [0, 1].


8. Let f, g be continuous from R to R, and suppose that f(r) = g(r) for all rational numbers r. Is it
true that f(x) = g(x) for all x ∈ R?


9. Let h : R → R be continuous on R satisfying h(m/2^n) = 0 for all m ∈ Z, n ∈ N. Show that
h(x) = 0 for all x ∈ R.


10. Let f : R → R be continuous on R, and let P := {x ∈ R : f(x) > 0}. If c ∈ P, show that there
exists a neighborhood V_δ(c) ⊆ P.


11. If f and g are continuous on R, let S := {x ∈ R : f(x) ≥ g(x)}. If (s_n) ⊆ S and lim(s_n) = s,
show that s ∈ S.


12. A function f : R → R is said to be additive if f(x + y) = f(x) + f(y) for all x, y in R. Prove
that if f is continuous at some point x_0, then it is continuous at every point of R. (See
Exercise 4.2.12.)


13. Suppose that f is a continuous additive function on R. If c := f(1), show that we have f(x) = cx
for all x ∈ R. [Hint: First show that if r is a rational number, then f(r) = cr.]


14. Let g : R → R satisfy the relation g(x + y) = g(x)g(y) for all x, y in R. Show that if g is
continuous at x = 0, then g is continuous at every point of R. Also if we have g(a) = 0 for some
a ∈ R, then g(x) = 0 for all x ∈ R.


15. Let f, g : R → R be continuous at a point c, and let h(x) := sup{f(x), g(x)} for x ∈ R. Show
that h(x) = (1/2)(f(x) + g(x)) + (1/2)|f(x) − g(x)| for all x ∈ R. Use this to show that h is
continuous at c.


Section 5.3 Continuous Functions on Intervals 


Functions that are continuous on intervals have a number of very important properties that 
are not possessed by general continuous functions. In this section, we will establish some 
deep results that are of considerable importance and that will be applied later. Alternative 
proofs of these results will be given in Section 5.5. 


5.3.1 Definition A function f : A → R is said to be bounded on A if there exists a
constant M > 0 such that |f(x)| ≤ M for all x ∈ A.


In other words, a function is bounded on a set if its range is a bounded set in R. To say
that a function is not bounded on a given set is to say that no particular number can serve
as a bound for its range. In exact language, a function f is not bounded on the set A if given
any M > 0, there exists a point x_M ∈ A such that |f(x_M)| > M. We often say that f is
unbounded on A in this case.

For example, the function f defined on the interval A := (0, ∞) by f(x) := 1/x is not
bounded on A because for any M > 0 we can take the point x_M := 1/(M + 1) in A to get
f(x_M) = 1/x_M = M + 1 > M. This example shows that continuous functions need not be
bounded. In the next theorem, however, we show that continuous functions on a certain
type of interval are necessarily bounded.
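
The choice x_M := 1/(M + 1) in the example above can be tried for a few values of M. This is a minimal sketch; the function name unbounded_witness is hypothetical, and floating-point arithmetic only approximates the exact identity f(x_M) = M + 1.

```python
def unbounded_witness(M):
    """Return (x_M, f(x_M)) for f(x) = 1/x on (0, inf), with x_M := 1/(M + 1)."""
    x_M = 1.0 / (M + 1.0)
    return x_M, 1.0 / x_M

for M in (1.0, 100.0, 1.0e6):
    x_M, value = unbounded_witness(M)
    # value is (up to rounding) M + 1, so it exceeds the proposed bound M
    print(M, x_M, value, value > M)
```

Whatever bound M is proposed, the same recipe produces a point of A at which f exceeds it.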


5.3.2 Boundedness Theorem† Let I := [a, b] be a closed bounded interval and let
f : I → R be continuous on I. Then f is bounded on I.


Proof. Suppose that f is not bounded on I. Then, for any n ∈ N there is a number x_n ∈ I
such that |f(x_n)| > n. Since I is bounded, the sequence X := (x_n) is bounded. Therefore,
the Bolzano-Weierstrass Theorem 3.4.8 implies that there is a subsequence X' = (x_{n_r}) of X
that converges to a number x. Since I is closed and the elements of X' belong to I, it follows
from Theorem 3.2.6 that x ∈ I. Then f is continuous at x, so that (f(x_{n_r})) converges to f(x).
We then conclude from Theorem 3.2.2 that the convergent sequence (f(x_{n_r})) must be
bounded. But this is a contradiction since


        |f(x_{n_r})| > n_r ≥ r    for    r ∈ N.


Therefore the supposition that the continuous function f is not bounded on the closed
bounded interval I leads to a contradiction. Q.E.D.


To show that each hypothesis of the Boundedness Theorem is needed, we can 
construct examples that show the conclusion fails if any one of the hypotheses is relaxed. 


(i) The interval must be bounded. The function f(x) := x for x in the unbounded,
closed interval A := [0, ∞) is continuous but not bounded on A.


(ii) The interval must be closed. The function g(x) := 1/x for x in the half-open
interval B := (0, 1] is continuous but not bounded on B.


(iii) The function must be continuous. The function h defined on the closed interval
C := [0, 1] by h(x) := 1/x for x ∈ (0, 1] and h(0) := 1 is discontinuous and unbounded
on C.


The Maximum-Minimum Theorem 


5.3.3 Definition Let A ⊆ R and let f : A → R. We say that f has an absolute maximum
on A if there is a point x* ∈ A such that


        f(x*) ≥ f(x)    for all    x ∈ A.

We say that f has an absolute minimum on A if there is a point x_* ∈ A such that

        f(x_*) ≤ f(x)    for all    x ∈ A.


We say that x* is an absolute maximum point for f on A, and that x_* is an absolute
minimum point for f on A, if they exist.


We note that a continuous function on a set A does not necessarily have an absolute
maximum or an absolute minimum on the set. For example, f(x) := 1/x has neither an
absolute maximum nor an absolute minimum on the set A := (0, ∞). (See Figure 5.3.1.)
There can be no absolute maximum for f on A since f is not bounded above on A, and there
is no point at which f attains the value 0 = inf{f(x) : x ∈ A}. The same function has
neither an absolute maximum nor an absolute minimum when it is restricted to the set
(0, 1), while it has both an absolute maximum and an absolute minimum when it is
restricted to the set [1, 2]. In addition, f(x) = 1/x has an absolute maximum but no
absolute minimum when restricted to the set [1, ∞), but no absolute maximum and no
absolute minimum when restricted to the set (1, ∞).


Figure 5.3.1 The function f(x) = 1/x (x > 0)
Figure 5.3.2 The function g(x) = x^2 (|x| ≤ 1)


† This theorem (5.3.2), as well as 5.3.4, is true for an arbitrary closed bounded set. For these
developments, see Sections 11.2 and 11.3.

It is readily seen that if a function has an absolute maximum point, then this point is 
not necessarily uniquely determined. For example, the function g(x) := x^2 defined for
x ∈ A := [−1, +1] has the two points x = ±1 giving the absolute maximum on A, and the
single point x = 0 yielding its absolute minimum on A. (See Figure 5.3.2.) To pick an
extreme example, the constant function h(x) := 1 for x ∈ R is such that every point of R is
both an absolute maximum and an absolute minimum point for h.


5.3.4 Maximum-Minimum Theorem Let I := [a, b] be a closed bounded interval and
let f : I → R be continuous on I. Then f has an absolute maximum and an absolute
minimum on I.


Proof. Consider the nonempty set f(I) := {f(x) : x ∈ I} of values of f on I. In Theorem 5.3.2
it was established that f(I) is a bounded subset of R. Let s* := sup f(I) and s_* := inf f(I).
We claim that there exist points x* and x_* in I such that s* = f(x*) and s_* = f(x_*). We will
establish the existence of the point x*, leaving the proof of the existence of x_* to the reader.

Since s* = sup f(I), if n ∈ N, then the number s* − 1/n is not an upper bound of the
set f(I). Consequently there exists a number x_n ∈ I such that


(1)        s* − 1/n < f(x_n) ≤ s*    for all    n ∈ N.


Since I is bounded, the sequence X := (x_n) is bounded. Therefore, by the Bolzano-Weierstrass
Theorem 3.4.8, there is a subsequence X' = (x_{n_r}) of X that converges to some
number x*. Since the elements of X' belong to I = [a, b], it follows from Theorem 3.2.6 that
x* ∈ I. Therefore f is continuous at x* so that lim(f(x_{n_r})) = f(x*). Since it follows from
(1) that


        s* − 1/n_r < f(x_{n_r}) ≤ s*    for all    r ∈ N,

we conclude from the Squeeze Theorem 3.2.7 that lim(f(x_{n_r})) = s*. Therefore we have


        f(x*) = lim(f(x_{n_r})) = s* = sup f(I).


We conclude that x* is an absolute maximum point of f on I. Q.E.D.


The next result is the theoretical basis for locating roots of a continuous function by
means of sign changes of the function. The proof also provides an algorithm, known as the
Bisection Method, for the calculation of roots to a specified degree of accuracy and can be
readily programmed for a computer. It is a standard tool for finding solutions of equations
of the form f(x) = 0, where f is a continuous function. An alternative proof of the theorem
is indicated in Exercise 5.3.11.


5.3.5 Location of Roots Theorem Let I = [a, b] and let f : I → R be continuous on I. If
f(a) < 0 < f(b), or if f(a) > 0 > f(b), then there exists a number c ∈ (a, b) such that


        f(c) = 0.


Proof. We assume that f(a) < 0 < f(b). We will generate a sequence of intervals by
successive bisections. Let I_1 := [a_1, b_1], where a_1 := a, b_1 := b, and let p_1 be the midpoint
p_1 := (1/2)(a_1 + b_1). If f(p_1) = 0, we take c := p_1 and we are done. If f(p_1) ≠ 0, then either
f(p_1) > 0 or f(p_1) < 0. If f(p_1) > 0, then we set a_2 := a_1, b_2 := p_1, while if f(p_1) < 0,
then we set a_2 := p_1, b_2 := b_1. In either case, we let I_2 := [a_2, b_2]; then we have I_2 ⊂ I_1 and
f(a_2) < 0, f(b_2) > 0.

We continue the bisection process. Suppose that the intervals I_1, I_2, ..., I_k have
been obtained by successive bisection in the same manner. Then we have f(a_k) < 0 and
f(b_k) > 0, and we set p_k := (1/2)(a_k + b_k). If f(p_k) = 0, we take c := p_k and we are done.
If f(p_k) > 0, we set a_{k+1} := a_k, b_{k+1} := p_k, while if f(p_k) < 0, we set a_{k+1} := p_k,
b_{k+1} := b_k. In either case, we let I_{k+1} := [a_{k+1}, b_{k+1}]; then I_{k+1} ⊂ I_k and
f(a_{k+1}) < 0, f(b_{k+1}) > 0.

If the process terminates by locating a point p_n such that f(p_n) = 0, then we are done.
If the process does not terminate, then we obtain a nested sequence of closed bounded
intervals I_n := [a_n, b_n] such that for every n ∈ N we have


        f(a_n) < 0    and    f(b_n) > 0.


Furthermore, since the intervals are obtained by repeated bisection, the length of I_n is
equal to b_n − a_n = (b − a)/2^{n−1}. It follows from the Nested Intervals Property 2.5.2
that there exists a point c that belongs to I_n for all n ∈ N. Since a_n ≤ c ≤ b_n for all
n ∈ N and lim(b_n − a_n) = 0, it follows that lim(a_n) = c = lim(b_n). Since f is continuous
at c, we have


        lim(f(a_n)) = f(c) = lim(f(b_n)).


The fact that f(a_n) < 0 for all n ∈ N implies that f(c) = lim(f(a_n)) ≤ 0. Also, the fact
that f(b_n) > 0 for all n ∈ N implies that f(c) = lim(f(b_n)) ≥ 0. Thus, we conclude that
f(c) = 0. Consequently, c is a root of f. Q.E.D.
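

The bisection process in the proof translates almost line for line into a program. The sketch below is one possible rendering, not a definitive implementation: the name bisect_root, the tolerance tol, and the step limit max_steps are our own assumptions, and floating-point evaluation of f replaces the exact sign tests of the proof.

```python
def bisect_root(f, a, b, tol=1e-8, max_steps=200):
    """Approximate a root of f in [a, b] by repeated bisection.

    Assumes f is continuous on [a, b] with f(a) < 0 < f(b) or f(a) > 0 > f(b).
    Returns a midpoint p whose distance to some root c is at most tol.
    """
    fa, fb = f(a), f(b)
    if fa * fb >= 0:
        raise ValueError("f(a) and f(b) must have strictly opposite signs")
    for _ in range(max_steps):
        p = 0.5 * (a + b)              # midpoint p_k of the current interval
        fp = f(p)
        if fp == 0 or 0.5 * (b - a) <= tol:
            return p                   # exact root, or |p - c| <= (b - a)/2 <= tol
        if (fp > 0) == (fb > 0):
            b = p                      # f(p) has the sign of f(b): keep [a, p]
        else:
            a = p                      # f(p) has the sign of f(a): keep [p, b]
    return 0.5 * (a + b)

root = bisect_root(lambda x: x * x - 2, 0.0, 2.0, tol=1e-10)
print(root)  # an approximation of the square root of 2
```

Each pass keeps the half interval on which the sign change persists, which is exactly the invariant f(a_k) < 0 < f(b_k) (or its mirror image) used in the proof.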


The following example illustrates how the Bisection Method for finding roots is 
applied in a systematic fashion. 


5.3.6 Example The equation f(x) = x e^x − 2 = 0 has a root c in the interval [0, 1],
because f is continuous on this interval and f(0) = −2 < 0 and f(1) = e − 2 > 0. Using a
calculator we construct the following table, where the sign of f(p_n) determines the interval
at the next step. The far right column is an upper bound on the error when p_n is used to
approximate the root c, because we have


        |p_n − c| ≤ (1/2)(b_n − a_n) = 1/2^n.


We will find an approximation p_n with error less than 10^{-2}.


n    a_n       b_n        p_n         f(p_n)    (1/2)(b_n − a_n)
1    0         1          .5          −1.176    .5
2    .5        1          .75         −.412     .25
3    .75       1          .875        +.099     .125
4    .75       .875       .8125       −.169     .0625
5    .8125     .875       .84375      −.0382    .03125
6    .84375    .875       .859375     +.0296    .015625
7    .84375    .859375    .8515625    --        .0078125


We have stopped at n = 7, obtaining c ≈ p_7 = .8515625 with error less than .0078125.
This is the first step in which the error is less than 10^{-2}. The decimal place values of
p_7 past the second place cannot be taken seriously, but we can conclude that
.843 < c < .860.
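

The table of Example 5.3.6 can be regenerated by a short program. This is a sketch under the assumption f(a) < 0 < f(b), as in the example; the name bisection_table and the row layout are our own, and the printed values of f(p_n) are floating-point approximations.

```python
import math

def f(x):
    return x * math.exp(x) - 2

def bisection_table(func, a, b, steps):
    """Rows (n, a_n, b_n, p_n, f(p_n), (b_n - a_n)/2), assuming func(a) < 0 < func(b)."""
    rows = []
    for n in range(1, steps + 1):
        p = 0.5 * (a + b)
        fp = func(p)
        rows.append((n, a, b, p, fp, 0.5 * (b - a)))
        if fp == 0:
            break                      # p is an exact root; stop early
        if fp > 0:
            b = p                      # root lies in [a, p]
        else:
            a = p                      # root lies in [p, b]
    return rows

rows = bisection_table(f, 0.0, 1.0, 7)
for n, a_n, b_n, p_n, fp_n, err in rows:
    print(f"{n:2d} {a_n:<9} {b_n:<9} {p_n:<10} {fp_n:+.4f} {err}")
```

Because every endpoint and midpoint here is a dyadic rational, the columns a_n, b_n, p_n and the error bound are computed exactly, so they can be compared directly with the table.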


Bolzano’s Theorem 


The next result is a generalization of the Location of Roots Theorem. It assures us that a 
continuous function on an interval takes on (at least once) any number that lies between 
two of its values. 


5.3.7 Bolzano’s Intermediate Value Theorem Let I be an interval and let f : I → R be
continuous on I. If a, b ∈ I and if k ∈ R satisfies f(a) < k < f(b), then there exists a point
c ∈ I between a and b such that f(c) = k.


Proof. Suppose that a < b and let g(x) := f(x) − k; then g(a) < 0 < g(b). By the Location
of Roots Theorem 5.3.5 there exists a point c with a < c < b such that 0 = g(c) = f(c) − k.
Therefore f(c) = k.

If b < a, let h(x) := k − f(x) so that h(b) < 0 < h(a). Therefore there exists a point c
with b < c < a such that 0 = h(c) = k − f(c), whence f(c) = k. Q.E.D.
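

The reduction used in this proof, from solving f(c) = k to locating a root of g(x) := f(x) − k, can be sketched in code. The name solve_level and the tolerance are hypothetical; the function tracks one endpoint where g is negative and one where g is positive, so it covers both cases a < b and b < a at once.

```python
def solve_level(f, a, b, k, tol=1e-10):
    """Find c between a and b with f(c) close to k.

    Assumes f is continuous between a and b and f(a) < k < f(b);
    a may lie to the left or to the right of b.
    """
    neg, pos = a, b                    # g(neg) < 0 < g(pos), where g = f - k
    if not (f(neg) < k < f(pos)):
        raise ValueError("need f(a) < k < f(b)")
    while abs(pos - neg) > tol:
        mid = 0.5 * (neg + pos)
        g_mid = f(mid) - k
        if g_mid == 0:
            return mid
        if g_mid < 0:
            neg = mid                  # keep the endpoint where g is negative
        else:
            pos = mid                  # keep the endpoint where g is positive
    return 0.5 * (neg + pos)

c = solve_level(lambda x: x ** 3, 0.0, 2.0, 5.0)
print(c, c ** 3)  # c cubed should be close to 5
```

The second call pattern, with a > b, corresponds to the case b < a treated at the end of the proof.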


5.3.8 Corollary Let I = [a, b] be a closed, bounded interval and let f : I → R be
continuous on I. If k ∈ R is any number satisfying


        inf f(I) ≤ k ≤ sup f(I),


then there exists a number c ∈ I such that f(c) = k.


Proof. It follows from the Maximum-Minimum Theorem 5.3.4 that there are points c_*
and c* in I such that


        inf f(I) = f(c_*) ≤ k ≤ f(c*) = sup f(I).


The conclusion now follows from Bolzano’s Theorem 5.3.7. Q.E.D. 


The next theorem summarizes the main results of this section. It states that the image 
of a closed bounded interval under a continuous function is also a closed bounded interval. 
The endpoints of the image interval are the absolute minimum and absolute maximum 
values of the function, and the statement that all values between the absolute minimum 
and the absolute maximum values belong to the image is a way of describing Bolzano’s 
Intermediate Value Theorem. 


5.3.9 Theorem Let I be a closed bounded interval and let f : I → R be continuous on I.
Then the set f(I) := {f(x) : x ∈ I} is a closed bounded interval.


Proof. If we let m := inf f(I) and M := sup f(I), then we know from the Maximum-Minimum
Theorem 5.3.4 that m and M belong to f(I). Moreover, we have f(I) ⊆ [m, M]. If
k is any element of [m, M], then it follows from the preceding corollary that there exists a
point c ∈ I such that k = f(c). Hence, k ∈ f(I) and we conclude that [m, M] ⊆ f(I).
Therefore, f(I) is the interval [m, M]. Q.E.D.


Warning If I := [a, b] is an interval and f : I → R is continuous on I, we have proved
that f(I) is the interval [m, M]. We have not proved (and it is not always true) that f(I) is the
interval [f(a), f(b)]. (See Figure 5.3.3.)


Figure 5.3.3 f(I) = [m, M]


The preceding theorem is a “preservation” theorem in the sense that it states that the 
continuous image of a closed bounded interval is a set of the same type. The next theorem 
extends this result to general intervals. However, it should be noted that although the 
continuous image of an interval is shown to be an interval, it is not true that the image 
interval necessarily has the same form as the domain interval. For example, the continuous 
image of an open interval need not be an open interval, and the continuous image of an 
unbounded closed interval need not be a closed interval. Indeed, if f(x) := 1/(x^2 + 1) for
x ∈ R, then f is continuous on R [see Example 5.2.3(b)]. It is easy to see that if
I_1 := (−1, 1), then f(I_1) = (1/2, 1], which is not an open interval. Also, if I_2 := [0, ∞),
then f(I_2) = (0, 1], which is not a closed interval. (See Figure 5.3.4.)


Figure 5.3.4 Graph of f(x) = 1/(x^2 + 1) (x ∈ R)
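

The two images f(I_1) = (1/2, 1] and f(I_2) = (0, 1] can be explored by sampling. The sample points below are our own illustrative choices, and a finite sample can only suggest which endpoints are attained: the value 1 is hit at x = 0, while 1/2 and 0 are approached but never reached.

```python
def f(x):
    return 1.0 / (x * x + 1.0)

# Interior points of I_1 = (-1, 1); the endpoints -1 and 1 are excluded.
n = 1000
vals_1 = [f(-1 + 2 * k / n) for k in range(1, n)]

# Sample points of I_2 = [0, inf), spread over many orders of magnitude.
vals_2 = [f(0.0)] + [f(10.0 ** k) for k in range(-3, 7)]

print(min(vals_1), max(vals_1))   # the minimum stays above 1/2, the maximum is 1
print(min(vals_2), max(vals_2))   # the minimum is small but positive, the maximum is 1
```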


To prove the Preservation of Intervals Theorem 5.3.10, we will use Theorem 2.5.1 
characterizing intervals. 


5.3.10 Preservation of Intervals Theorem Let I be an interval and let f : I → R be
continuous on I. Then the set f(I) is an interval.


Proof. Let α, β ∈ f(I) with α < β; then there exist points a, b ∈ I such that α = f(a)
and β = f(b). Further, it follows from Bolzano’s Intermediate Value Theorem 5.3.7
that if k ∈ (α, β) then there exists a number c ∈ I with k = f(c) ∈ f(I). Therefore
[α, β] ⊆ f(I), showing that f(I) possesses property (1) of Theorem 2.5.1. Therefore
f(I) is an interval. Q.E.D.


Exercises for Section 5.3 


1. Let I := [a, b] and let f : I → R be a continuous function such that f(x) > 0 for each x in I.
Prove that there exists a number α > 0 such that f(x) ≥ α for all x ∈ I.


2. Let I := [a, b] and let f : I → R and g : I → R be continuous functions on I. Show that the set
E := {x ∈ I : f(x) = g(x)} has the property that if (x_n) ⊆ E and x_n → x_0, then x_0 ∈ E.


3. Let I := [a, b] and let f : I → R be a continuous function on I such that for each x in I there
exists y in I such that |f(y)| ≤ (1/2)|f(x)|. Prove there exists a point c in I such that f(c) = 0.


4. Show that every polynomial of odd degree with real coefficients has at least one real root.


5. Show that the polynomial p(x) := x^4 + 7x^3 − 9 has at least two real roots. Use a calculator to
locate these roots to within two decimal places.


6. Let f be continuous on the interval [0, 1] to R and such that f(0) = f(1). Prove that there exists
a point c in [0, 1/2] such that f(c) = f(c + 1/2). [Hint: Consider g(x) = f(x) − f(x + 1/2).] Conclude
that there are, at any time, antipodal points on the earth’s equator that have the same
temperature.


7. Show that the equation x = cos x has a solution in the interval [0, π/2]. Use the Bisection
Method and a calculator to find an approximate solution of this equation, with error less than
10^{-3}.


8. Show that the function f(x) := 2 ln x + √x − 2 has a root in the interval [1, 2]. Use the Bisection
Method and a calculator to find the root with error less than 10^{-2}.


9. (a) The function f(x) := (x − 1)(x − 2)(x − 3)(x − 4)(x − 5) has five roots in the interval
[0, 7]. If the Bisection Method is applied on this interval, which of the roots is located?
(b) Same question for g(x) := (x − 2)(x − 3)(x − 4)(x − 5)(x − 6) on the interval [0, 7].


10. If the Bisection Method is used on an interval of length 1 to find p_n with error |p_n − c| < 10^{-5},
determine the least value of n that will assure this accuracy.


11. Let I := [a, b], let f : I → R be continuous on I, and assume that f(a) < 0, f(b) > 0. Let
W := {x ∈ I : f(x) < 0}, and let w := sup W. Prove that f(w) = 0. (This provides an alternative
proof of Theorem 5.3.5.)


12. Let I := [0, π/2] and let f : I → R be defined by f(x) := sup{x^2, cos x} for x ∈ I. Show there
exists an absolute minimum point x_0 ∈ I for f on I. Show that x_0 is a solution to the equation
cos x = x^2.


13. Suppose that f : R → R is continuous on R and that lim_{x→−∞} f = 0 and lim_{x→∞} f = 0. Prove that f is
bounded on R and attains either a maximum or minimum on R. Give an example to show that
both a maximum and a minimum need not be attained.


14. Let f : R → R be continuous on R and let β ∈ R. Show that if x_0 ∈ R is such that f(x_0) < β,
then there exists a δ-neighborhood U of x_0 such that f(x) < β for all x ∈ U.


15. Examine which open [respectively, closed] intervals are mapped by f(x) := x^2 for x ∈ R onto
open [respectively, closed] intervals.


16. Examine the mapping of open [respectively, closed] intervals under the functions g(x) :=
1/(x^2 + 1) and h(x) := x^3 for x ∈ R.


17. If f : [0, 1] → R is continuous and has only rational [respectively, irrational] values, must f be
constant? Prove your assertion.


18. Let I := [a, b] and let f : I → R be a (not necessarily continuous) function with the property that
for every x ∈ I, the function f is bounded on a neighborhood V_{δ_x}(x) of x (in the sense of
Definition 4.2.1). Prove that f is bounded on I.


19. Let J := (a, b) and let g : J → R be a continuous function with the property that for every x ∈ J,
the function g is bounded on a neighborhood V_{δ_x}(x) of x. Show by example that g is not
necessarily bounded on J.


Section 5.4 Uniform Continuity 


Let A ⊆ R and let f : A → R. Definition 5.1.1 states that the following statements are
equivalent:


(i) f is continuous at every point u ∈ A;
(ii) given ε > 0 and u ∈ A, there is a δ(ε, u) > 0 such that for all x such that x ∈ A
and |x − u| < δ(ε, u), then |f(x) − f(u)| < ε.


The point we wish to emphasize here is that δ depends, in general, on both ε > 0 and
u ∈ A. The fact that δ depends on u is a reflection of the fact that the function f may change
its values rapidly near certain points and slowly near other points. [For example, consider
f(x) := sin(1/x) for x > 0; see Figure 4.1.3.]

Now it often happens that the function f is such that the number δ can be chosen to be
independent of the point u ∈ A and to depend only on ε. For example, if f(x) := 2x for all
x ∈ R, then


        |f(x) − f(u)| = 2|x − u|,

and so we can choose δ(ε, u) := ε/2 for all ε > 0 and all u ∈ R. (Why?)
On the other hand if g(x) := 1/x for x ∈ A := {x ∈ R : x > 0}, then


(1)        g(x) − g(u) = (u − x)/(ux).

If u ∈ A is given and if we take

(2)        δ(ε, u) := inf{(1/2)u, (1/2)u^2 ε},


then if |x − u| < δ(ε, u), we have |x − u| < (1/2)u so that (1/2)u < x < (3/2)u, whence it follows that
1/x < 2/u. Thus, if |x − u| < (1/2)u, the equality (1) yields the inequality


(3)        |g(x) − g(u)| ≤ (2/u^2)|x − u|.

Consequently, if |x − u| < δ(ε, u), then (2) and (3) imply that

        |g(x) − g(u)| < (2/u^2)((1/2)u^2 ε) = ε.


We have seen that the selection of δ(ε, u) by the formula (2) “works” in the sense that it
enables us to give a value of δ that will ensure that |g(x) − g(u)| < ε when |x − u| < δ and
x, u ∈ A. We note that the value of δ(ε, u) given in (2) certainly depends on the point u ∈ A.
If we wish to consider all u ∈ A, formula (2) does not lead to one value δ(ε) > 0 that will
“work” simultaneously for all u > 0, since inf{δ(ε, u) : u > 0} = 0.

In fact, there is no way of choosing one value of δ that will “work” for all u > 0 for the
function g(x) = 1/x. The situation is exhibited graphically in Figures 5.4.1 and 5.4.2
where, for a given ε-neighborhood V_ε(1/2) about 1/2 = g(2) and V_ε(2) about 2 = g(1/2), the
corresponding maximum values of δ are seen to be considerably different. As u tends to 0,
the permissible values of δ tend to 0.
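

Formula (2) is easy to evaluate, which makes the dependence of δ on u concrete. The sketch below assumes ε = 0.1 and three sample points u of our own choosing; the function name delta is hypothetical.

```python
def g(x):
    return 1.0 / x

def delta(eps, u):
    """delta(eps, u) := inf{u/2, u^2 * eps / 2}, as in formula (2), for u > 0."""
    return min(0.5 * u, 0.5 * u * u * eps)

eps = 0.1
for u in (2.0, 0.5, 0.01):
    d = delta(eps, u)
    x = u - 0.999 * d                  # a point just inside the delta-neighborhood of u
    print(u, d, abs(g(x) - g(u)))      # the last value stays below eps
```

The computed values of δ(ε, u) shrink rapidly as u approaches 0, in line with the remark that inf{δ(ε, u) : u > 0} = 0.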


Figure 5.4.1 g(x) = 1/x (x > 0)
Figure 5.4.2 g(x) = 1/x (x > 0)


5.4.1 Definition Let A ⊆ R and let f : A → R. We say that f is uniformly continuous
on A if for each ε > 0 there is a δ(ε) > 0 such that if x, u ∈ A are any numbers satisfying
|x − u| < δ(ε), then |f(x) − f(u)| < ε.


It is clear that if f is uniformly continuous on A, then it is continuous at every point of A.
In general, however, the converse does not hold, as is shown by the function g(x) = 1/x on
the set A := {x ∈ R : x > 0}.

It is useful to formulate a condition equivalent to saying that f is not uniformly 
continuous on A. We give such criteria in the next result, leaving the proof to the reader as 
an exercise. 


5.4.2 Nonuniform Continuity Criteria Let A ⊆ R and let f : A → R. Then the following
statements are equivalent:


(i) f is not uniformly continuous on A.

(ii) There exists an ε_0 > 0 such that for every δ > 0 there are points x_δ, u_δ in A such that
|x_δ − u_δ| < δ and |f(x_δ) − f(u_δ)| ≥ ε_0.

(iii) There exists an ε_0 > 0 and two sequences (x_n) and (u_n) in A such that
lim(x_n − u_n) = 0 and |f(x_n) − f(u_n)| ≥ ε_0 for all n ∈ N.


We can apply this result to show that g(x) := 1/x is not uniformly continuous
on A := {x ∈ R : x > 0}. For, if x_n := 1/n and u_n := 1/(n + 1), then we have
lim(x_n − u_n) = 0, but |g(x_n) − g(u_n)| = 1 for all n ∈ N.
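

The two sequences in this application can be computed exactly with rational arithmetic. This is a small sketch; the helper name gap_and_jump is our own, and Python's fractions module is used so that the value |g(x_n) − g(u_n)| = 1 is not blurred by rounding.

```python
from fractions import Fraction

def g(x):
    return 1 / x

def gap_and_jump(n):
    """Return (x_n - u_n, |g(x_n) - g(u_n)|) for x_n := 1/n and u_n := 1/(n + 1)."""
    x_n, u_n = Fraction(1, n), Fraction(1, n + 1)
    return x_n - u_n, abs(g(x_n) - g(u_n))

for n in (1, 10, 100, 1000):
    gap, jump = gap_and_jump(n)
    print(n, gap, jump)   # the gap 1/(n(n+1)) tends to 0 while the jump stays at 1
```

So criterion (iii) holds with ε_0 = 1.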


We now present an important result that assures that a continuous function on a closed
bounded interval I is uniformly continuous on I. Other proofs of this theorem are given in
Sections 5.5 and 11.3.


5.4.3 Uniform Continuity Theorem Let I be a closed bounded interval and let
f : I → R be continuous on I. Then f is uniformly continuous on I.


Proof. If f is not uniformly continuous on I then, by the preceding result, there exists
ε_0 > 0 and two sequences (x_n) and (u_n) in I such that |x_n − u_n| < 1/n and
|f(x_n) − f(u_n)| ≥ ε_0 for all n ∈ N. Since I is bounded, the sequence (x_n) is bounded; by the
Bolzano-Weierstrass Theorem 3.4.8 there is a subsequence (x_{n_k}) of (x_n) that converges to an element
z. Since I is closed, the limit z belongs to I, by Theorem 3.2.6. It is clear that the
corresponding subsequence (u_{n_k}) also converges to z, since


        |u_{n_k} − z| ≤ |u_{n_k} − x_{n_k}| + |x_{n_k} − z|.


Now if f is continuous at the point z, then both of the sequences (f(x_{n_k})) and (f(u_{n_k}))
must converge to f(z). But this is not possible since


        |f(x_n) − f(u_n)| ≥ ε_0


for all n ∈ N. Thus the hypothesis that f is not uniformly continuous on the closed bounded
interval I implies that f is not continuous at some point z ∈ I. Consequently, if f is
continuous at every point of I, then f is uniformly continuous on I. Q.E.D.


Lipschitz Functions 


If a uniformly continuous function is given on a set that is not a closed bounded interval, 
then it is sometimes difficult to establish its uniform continuity. However, there is a 
condition that frequently occurs that is sufficient to guarantee uniform continuity. It is 
named after Rudolf Lipschitz (1832–1903) who was a student of Dirichlet and who worked
extensively in differential equations and Riemannian geometry. 


5.4.4 Definition Let A ⊆ R and let f : A → R. If there exists a constant K > 0 such that

(4)        |f(x) − f(u)| ≤ K|x − u|


for all x, u ∈ A, then f is said to be a Lipschitz function (or to satisfy a Lipschitz
condition) on A.


The condition (4) that a function f : I → R on an interval I is a Lipschitz function can
be interpreted geometrically as follows. If we write the condition as


        |(f(x) − f(u))/(x − u)| ≤ K,    x, u ∈ I, x ≠ u,


then the quantity inside the absolute values is the slope of a line segment joining the points
(x, f(x)) and (u, f(u)). Thus a function f satisfies a Lipschitz condition if and only if the
slopes of all line segments joining two points on the graph of y = f(x) over I are bounded
by some number K.


5.4.5 Theorem If f : A → R is a Lipschitz function, then f is uniformly continuous on A.


Proof. If condition (4) is satisfied, then given ε > 0, we can take δ := ε/K. If x, u ∈ A
satisfy |x − u| < δ, then


        |f(x) − f(u)| < K · (ε/K) = ε.


Therefore f is uniformly continuous on A. Q.E.D.
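

The choice δ := ε/K in this proof can be tested on a concrete Lipschitz function. The sketch below assumes f(x) := x^2 on [0, b] with b = 3, for which K = 2b is a Lipschitz constant; the values of b and ε, the random sampling, and the name lipschitz_delta are illustrative assumptions only.

```python
import random

def lipschitz_delta(eps, K):
    """The delta from the proof: one value that works at every point of A."""
    return eps / K

def f(x):
    return x * x                       # on A = [0, b]: |f(x) - f(u)| <= 2b|x - u|

b = 3.0
K = 2 * b
eps = 1e-3
d = lipschitz_delta(eps, K)

rng = random.Random(0)                 # fixed seed so the run is reproducible
worst = 0.0
for _ in range(10000):
    u = rng.uniform(0.0, b)
    x = min(b, max(0.0, u + rng.uniform(-d, d)))   # a point of A with |x - u| <= d
    worst = max(worst, abs(f(x) - f(u)))

print(d, worst)   # worst should not exceed eps
```

Note that d does not depend on the point u, which is the whole content of uniform continuity.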


5.4.6 Examples (a) If f(x) := x^2 on A := [0, b], where b > 0, then

        |f(x) − f(u)| = |x + u||x − u| ≤ 2b|x − u|


for all x, u in [0, b]. Thus f satisfies (4) with K := 2b on A, and therefore f is uniformly
continuous on A. Of course, since f is continuous and A is a closed bounded interval, this
can also be deduced from the Uniform Continuity Theorem. (Note that f does not satisfy a
Lipschitz condition on the interval [0, ∞).)

(b) Not every uniformly continuous function is a Lipschitz function.

Let g(x) := √x for x in the closed bounded interval I := [0, 2]. Since g is continuous
on I, it follows from the Uniform Continuity Theorem 5.4.3 that g is uniformly continuous
on I. However, there is no number K > 0 such that |g(x)| ≤ K|x| for all x ∈ I. (Why not?)
Therefore, g is not a Lipschitz function on I.

(c) The Uniform Continuity Theorem and Theorem 5.4.5 can sometimes be combined to
establish the uniform continuity of a function on a set.

We consider g(x) := √x on the set A := [0, ∞). The uniform continuity of g on the
interval I := [0, 2] follows from the Uniform Continuity Theorem as noted in (b). If
J := [1, ∞), then if both x, u are in J, we have


        |g(x) − g(u)| = |√x − √u| = |x − u|/(√x + √u) ≤ (1/2)|x − u|.


Thus g is a Lipschitz function on J with constant K = 1/2, and hence by Theorem 5.4.5, g is
uniformly continuous on [1, ∞). Since A = I ∪ J, it follows [by taking δ(ε) :=
inf{1, δ_I(ε), δ_J(ε)}] that g is uniformly continuous on A. We leave the details to the
reader.
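

Parts (b) and (c) can be contrasted numerically: chord slopes of g(x) := √x from the origin blow up, while chord slopes over J = [1, ∞) stay below 1/2. The sample points below are our own choices, and the finite sample over J is only suggestive of the bound proved in (c).

```python
import math

def slope_from_zero(x):
    """|g(x) - g(0)| / |x - 0| for g(x) = sqrt(x); this equals 1/sqrt(x)."""
    return math.sqrt(x) / x

def chord_slope(x, u):
    """Absolute slope of the chord of g(x) = sqrt(x) between x and u."""
    return abs(math.sqrt(x) - math.sqrt(u)) / abs(x - u)

# Near 0 the slopes are unbounded, so no Lipschitz constant K can work on [0, 2].
near_zero = [slope_from_zero(10.0 ** (-k)) for k in range(1, 9)]

# On a sample of J = [1, inf) every chord slope is at most 1/2.
pts = [1 + 0.5 * k for k in range(200)]
worst_on_J = max(chord_slope(x, u) for x in pts for u in pts if x != u)

print(near_zero[-1], worst_on_J)
```

This also suggests an answer to the "Why not?" in part (b): the quotient |g(x)|/|x| = 1/√x is unbounded as x tends to 0.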


The Continuous Extension Theorem 


We have seen examples of functions that are continuous but not uniformly continuous on 
open intervals; for example, the function f(x) = 1/x on the interval (0, 1). On the other 
hand, by the Uniform Continuity Theorem, a function that is continuous on a closed 
bounded interval is always uniformly continuous. So the question arises: Under what 
conditions is a function uniformly continuous on a bounded open interval? The answer 
reveals the strength of uniform continuity, for it will be shown that a function on (a, b) is 
uniformly continuous if and only if it can be defined at the endpoints to produce a function 
that is continuous on the closed interval. We first establish a result that is of interest in itself. 


5.4.7 Theorem If f : A → R is uniformly continuous on a subset A of R and if (x_n) is a
Cauchy sequence in A, then (f(x_n)) is a Cauchy sequence in R.


Proof. Let (x_n) be a Cauchy sequence in A, and let ε > 0 be given. First choose δ > 0 such
that if x, u in A satisfy |x − u| < δ, then |f(x) − f(u)| < ε. Since (x_n) is a Cauchy
sequence, there exists H(δ) such that |x_n − x_m| < δ for all n, m > H(δ). By the choice of δ,
this implies that for n, m > H(δ), we have |f(x_n) − f(x_m)| < ε. Therefore the sequence
(f(x_n)) is a Cauchy sequence. Q.E.D.


The preceding result gives us an alternative way of seeing that f(x) := 1/x is not
uniformly continuous on (0, 1). We note that the sequence given by x_n := 1/n in (0, 1) is a
Cauchy sequence, but the image sequence, where f(x_n) = n, is not a Cauchy sequence.
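
A minimal numerical sketch of this remark follows; the indices 1000 and 2000 are arbitrary choices made for illustration. The terms 1/n crowd together near 0, while their images under f(x) = 1/x stay far apart.

```python
def x(n):
    """The sequence x_n = 1/n in (0, 1)."""
    return 1.0 / n

def f(t):
    """f(t) = 1/t on (0, 1)."""
    return 1.0 / t

n, m = 1000, 2000
print(abs(x(n) - x(m)))        # small: the terms x_n cluster near 0
print(abs(f(x(n)) - f(x(m))))  # roughly m - n: the images spread out
```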


5.4.8 Continuous Extension Theorem A function f is uniformly continuous on the
interval (a, b) if and only if it can be defined at the endpoints a and b such that the
extended function is continuous on [a, b].


Proof. (⇐) This direction is trivial.

(⇒) Suppose f is uniformly continuous on (a, b). We shall show how to extend f to a;
the argument for b is similar. This is done by showing that lim_{x→a} f(x) = L exists, and this is
accomplished by using the sequential criterion for limits. If (x_n) is a sequence in (a, b) with
lim(x_n) = a, then it is a Cauchy sequence, and by the preceding theorem, the sequence
(f(x_n)) is also a Cauchy sequence, and so is convergent by Theorem 3.5.5. Thus the limit
lim(f(x_n)) = L exists. If (u_n) is any other sequence in (a, b) that converges to a, then
lim(u_n − x_n) = a − a = 0, so by the uniform continuity of f we have


lim(f(u_n)) = lim(f(u_n) − f(x_n)) + lim(f(x_n))
            = 0 + L = L.


Since we get the same value L for every sequence converging to a, we infer from the
sequential criterion for limits that f has limit L at a. If we define f(a) := L, then f is
continuous at a. The same argument applies to b, so we conclude that f has a continuous
extension to the interval [a, b]. Q.E.D.


Since the limit of f(x) := sin(1/x) at 0 does not exist, we infer from the Continuous
Extension Theorem that the function is not uniformly continuous on (0, b] for any b > 0.
On the other hand, since lim_{x→0} x sin(1/x) = 0 exists, the function g(x) := x sin(1/x) is
uniformly continuous on (0, b] for all b > 0.
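
The two functions can be compared numerically. In the sketch below (helper names are hypothetical), peak(n) and trough(n) are points where sin(1/x) equals 1 and −1; they approach each other as n grows, yet the values of sin(1/x) stay about 2 apart, whereas the values of x sin(1/x) are squeezed toward 0.

```python
import math

def f(x):
    return math.sin(1.0 / x)

def g(x):
    return x * math.sin(1.0 / x)

def peak(n):
    """A point of (0, 1) where sin(1/x) = 1."""
    return 1.0 / (2 * math.pi * n + math.pi / 2)

def trough(n):
    """A point of (0, 1) where sin(1/x) = -1."""
    return 1.0 / (2 * math.pi * n + 3 * math.pi / 2)

for n in (1, 10, 100):
    x, u = peak(n), trough(n)
    # columns: |x - u|, |f(x) - f(u)|, |g(x) - g(u)|
    print(abs(x - u), abs(f(x) - f(u)), abs(g(x) - g(u)))
```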


Approximation†


In many applications it is important to be able to approximate continuous functions by 
functions of an elementary nature. Although there are a variety of definitions that can be used 
to make the word “approximate” more precise, one of the most natural (as well as one of the 
most important) is to require that, at every point of the given domain, the approximating 
function shall not differ from the given function by more than the preassigned error. 


5.4.9 Definition A function s : [a, b] → R is called a step function if [a, b] is the union
of a finite number of nonoverlapping intervals I_1, I_2, ..., I_n such that s is constant on each
interval, that is, s(x) = c_k for all x ∈ I_k, k = 1, 2, ..., n.


Thus a step function has only a finite number of distinct values. 
For example, the function s : [−2, 4] → R defined by


        −2 < x < −1,
        −1 < x < 0,
         0 < x < ½,
         ½ < x < 1,
         1 < x < 3,
         3 < x < 4,


is a step function. (See Figure 5.4.3.)


†The rest of this section can be omitted on a first reading of this chapter.


Figure 5.4.3 Graph of y = s(x) 


We will now show that a continuous function on a closed bounded interval I can be
approximated arbitrarily closely by step functions.


5.4.10 Theorem Let I be a closed bounded interval and let f : I → R be continuous on I.
If ε > 0, then there exists a step function s_ε : I → R such that |f(x) − s_ε(x)| < ε for all
x ∈ I.


Proof. Since (by the Uniform Continuity Theorem 5.4.3) the function f is uniformly
continuous, it follows that given ε > 0 there is a number δ(ε) > 0 such that if x, y ∈ I and
|x − y| < δ(ε), then |f(x) − f(y)| < ε. Let I := [a, b] and let m ∈ N be sufficiently large so
that h := (b − a)/m < δ(ε). We now divide I = [a, b] into m disjoint intervals of length h;
namely, I_1 := [a, a + h], and I_k := (a + (k − 1)h, a + kh] for k = 2, ..., m. Since the
length of each subinterval I_k is h < δ(ε), the difference between any two values of f in I_k is
less than ε. We now define


(5)    s_ε(x) := f(a + kh) for x ∈ I_k, k = 1, ..., m,


so that s_ε is constant on each interval I_k. (In fact the value of s_ε on I_k is the value of f at the
right endpoint of I_k. See Figure 5.4.4.) Consequently if x ∈ I_k, then


|f(x) − s_ε(x)| = |f(x) − f(a + kh)| < ε.


Therefore we have |f(x) − s_ε(x)| < ε for all x ∈ I. Q.E.D.
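
The construction in this proof translates directly into a short program. The following Python sketch builds the step function of equation (5); the function name step_approximation and the test function f(x) = x² on [0, 1] are assumptions chosen for illustration. Since that f satisfies a Lipschitz condition with K = 2 on [0, 1], one may take δ(ε) = ε/2, so for ε = 0.1 any m with 1/m < 0.05 should do.

```python
import math

def step_approximation(f, a, b, m):
    """Return s with s(x) = f(a + k*h) on I_k, where h = (b - a)/m.

    I_1 = [a, a + h] and I_k = (a + (k-1)h, a + kh] for k = 2, ..., m,
    so s takes the value of f at the right endpoint of each subinterval.
    """
    h = (b - a) / m

    def s(x):
        if not a <= x <= b:
            raise ValueError("x outside [a, b]")
        # index k of the subinterval containing x (right-closed intervals)
        k = max(1, math.ceil((x - a) / h))
        k = min(k, m)  # guard against rounding at x = b
        return f(a + k * h)

    return s

eps = 0.1
m = 21  # 1/21 < eps/2
s = step_approximation(lambda x: x * x, 0.0, 1.0, m)
worst = max(abs(x * x - s(x)) for x in (i / 1000 for i in range(1001)))
print(worst < eps)
```

The sampled maximum only estimates the true supremum of |f − s_ε|; the proof is what guarantees the bound at every point.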




Figure 5.4.4 Approximation by step functions


Note that the proof of the preceding theorem establishes somewhat more than was
announced in the statement of the theorem. In fact, we have proved the following, more
precise, assertion.


5.4.11 Corollary Let I := [a, b] be a closed bounded interval and let f : I → R be
continuous on I. If ε > 0, there exists a natural number m such that if we divide I into m
disjoint intervals I_k having length h := (b − a)/m, then the step function s_ε defined in
equation (5) satisfies |f(x) − s_ε(x)| < ε for all x ∈ I.


Step functions are extremely elementary in character, but they are not continuous 
(except in trivial cases). Since it is often desirable to approximate continuous functions by 
elementary continuous functions, we now shall show that we can approximate continuous 
functions by continuous piecewise linear functions. 


5.4.12 Definition Let I := [a, b] be an interval. Then a function g : I → R is said to be
piecewise linear on I if I is the union of a finite number of disjoint intervals I_1, ..., I_m,
such that the restriction of g to each interval I_k is a linear function.


Remark It is evident that in order for a piecewise linear function g to be continuous on
I, the line segments that form the graph of g must meet at the endpoints of adjacent
subintervals I_k, I_{k+1} (k = 1, ..., m − 1).


5.4.13 Theorem Let I be a closed bounded interval and let f : I → R be continuous on I.
If ε > 0, then there exists a continuous piecewise linear function g_ε : I → R such that
|f(x) − g_ε(x)| < ε for all x ∈ I.


Proof. Since f is uniformly continuous on I := [a, b], there is a number δ(ε) > 0 such that
if x, y ∈ I and |x − y| < δ(ε), then |f(x) − f(y)| < ε. Let m ∈ N be sufficiently large so
that h := (b − a)/m < δ(ε). Divide I = [a, b] into m disjoint intervals of length h, namely,
let I_1 = [a, a + h], and let I_k = (a + (k − 1)h, a + kh] for k = 2, ..., m. On each interval
I_k we define g_ε to be the linear function joining the points


(a + (k − 1)h, f(a + (k − 1)h)) and (a + kh, f(a + kh)).


Then g_ε is a continuous piecewise linear function on I. Since, for x ∈ I_k, the value f(x) is
within ε of f(a + (k − 1)h) and f(a + kh), it is an exercise to show that |f(x) − g_ε(x)| < ε
for all x ∈ I_k; therefore this inequality holds for all x ∈ I. (See Figure 5.4.5.) Q.E.D.
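
The piecewise linear approximant of this proof can be sketched in the same way. In the Python code below, the name piecewise_linear and the test function f(x) = x² on [0, 1] are assumptions for illustration. On each subinterval the value g_ε(x) is a weighted average of the two endpoint values of f, which is the observation behind the exercise mentioned in the proof.

```python
def piecewise_linear(f, a, b, m):
    """Continuous piecewise linear g joining (a + k*h, f(a + k*h)), k = 0..m."""
    h = (b - a) / m
    nodes = [a + k * h for k in range(m + 1)]
    values = [f(t) for t in nodes]

    def g(x):
        if not a <= x <= b:
            raise ValueError("x outside [a, b]")
        k = min(int((x - a) / h), m - 1)   # x lies in [nodes[k], nodes[k+1]]
        t = (x - nodes[k]) / h             # relative position in the subinterval
        return (1 - t) * values[k] + t * values[k + 1]

    return g

eps = 0.1
g = piecewise_linear(lambda x: x * x, 0.0, 1.0, 21)
worst = max(abs(x * x - g(x)) for x in (i / 1000 for i in range(1001)))
print(worst < eps)
```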




Figure 5.4.5 Approximation by piecewise linear function


We shall close this section by stating the important theorem of Weierstrass
concerning the approximation of continuous functions by polynomial functions. As would be
expected, in order to obtain an approximation within an arbitrarily preassigned ε > 0,
we must be prepared to use polynomials of arbitrarily high degree.


5.4.14 Weierstrass Approximation Theorem Let I = [a, b] and let f : I → R be a
continuous function. If ε > 0 is given, then there exists a polynomial function p_ε such that
|f(x) − p_ε(x)| < ε for all x ∈ I.


There are a number of proofs of this result. Unfortunately, all of them are rather
intricate, or employ results that are not yet at our disposal. (A proof can be found in Bartle,
ERA, pp. 169-172, which is listed in the References.)


Exercises for Section 5.4 


1. Show that the function f(x) := 1/x is uniformly continuous on the set A := [a, ∞), where a is a
positive constant.


2. Show that the function f(x) := 1/x² is uniformly continuous on A := [1, ∞), but that it is not
uniformly continuous on B := (0, ∞).


3. Use the Nonuniform Continuity Criterion 5.4.2 to show that the following functions are not
uniformly continuous on the given sets.
(a) f(x) := x², A := [0, ∞).
(b) g(x) := sin(1/x), B := (0, ∞).

4. Show that the function f(x) := 1/(1 + x²) for x ∈ R is uniformly continuous on R.


5. Show that if f and g are uniformly continuous on a subset A of R, then f + g is uniformly 
continuous on A. 


6. Show that if f and g are uniformly continuous on A ⊆ R and if they are both bounded on A, then
their product fg is uniformly continuous on A.


7. If f(x) := x and g(x) := sin x, show that both f and g are uniformly continuous on R, but that
their product fg is not uniformly continuous on R.


8. Prove that if f and g are each uniformly continuous on R, then the composite function f ∘ g is
uniformly continuous on R.


9. If f is uniformly continuous on A ⊆ R, and |f(x)| ≥ k > 0 for all x ∈ A, show that 1/f is
uniformly continuous on A.


10. Prove that if f is uniformly continuous on a bounded subset A of R, then f is bounded on A. 


11. If g(x) := √x for x ∈ [0, 1], show that there does not exist a constant K such that |g(x)| ≤
K|x| for all x ∈ [0, 1]. Conclude that the uniformly continuous g is not a Lipschitz function
on [0, 1].


12. Show that if f is continuous on [0, ∞) and uniformly continuous on [a, ∞) for some positive
constant a, then f is uniformly continuous on [0, ∞).


13. Let A ⊆ R and suppose that f : A → R has the following property: for each ε > 0 there exists a
function g_ε : A → R such that g_ε is uniformly continuous on A and |f(x) − g_ε(x)| < ε for all
x ∈ A. Prove that f is uniformly continuous on A.


14. A function f : R → R is said to be periodic on R if there exists a number p > 0 such that
f(x + p) = f(x) for all x ∈ R. Prove that a continuous periodic function on R is bounded and
uniformly continuous on R.


15. Let f and g be Lipschitz functions on A.
(a) Show that the sum f + g is also a Lipschitz function on A.
(b) Show that if f and g are bounded on A, then the product fg is a Lipschitz function on A.
(c) Give an example of a Lipschitz function f on [0, ∞) such that its square f² is not a Lipschitz
function.


16. A function is called absolutely continuous on an interval I if for any ε > 0 there exists a δ > 0
such that for any pair-wise disjoint subintervals [x_k, y_k], k = 1, 2, ..., n, of I such that
Σ|x_k − y_k| < δ we have Σ|f(x_k) − f(y_k)| < ε. Show that if f satisfies a Lipschitz condition
on I, then f is absolutely continuous on I.


Section 5.5 Continuity and Gauges†


We will now introduce some concepts that will be used later—especially in Chapters 7 and 
10 on integration theory. However, we wish to introduce the notion of a “gauge” now 
because of its connection with the study of continuous functions. We first define the notion 
of a tagged partition of an interval. 


5.5.1 Definition A partition of an interval I := [a, b] is a collection P = {I_1, ..., I_n} of
non-overlapping closed intervals whose union is [a, b]. We ordinarily denote the intervals
by I_i := [x_{i−1}, x_i], where


a = x_0 < ··· < x_{i−1} < x_i < ··· < x_n = b.


The points x_i (i = 0, ..., n) are called the partition points of P. If a point t_i has been
chosen from each interval I_i, for i = 1, ..., n, then the points t_i are called the tags and the
set of ordered pairs


Ṗ = {(I_1, t_1), ..., (I_n, t_n)}

is called a tagged partition of I. (The dot signifies that the partition is tagged.)


The “fineness” of a partition P refers to the lengths of the subintervals in P. Instead of
requiring that all subintervals have length less than some specific quantity, it is often useful
to allow varying degrees of fineness for different subintervals I_i in P. This is accomplished
by the use of a “gauge,” which we now define.


5.5.2 Definition A gauge on I is a strictly positive function defined on I. If δ is a gauge on
I, then a (tagged) partition Ṗ is said to be δ-fine if


(1)    t_i ∈ I_i ⊆ [t_i − δ(t_i), t_i + δ(t_i)] for i = 1, ..., n.


We note that the notion of δ-fineness requires that the partition be tagged, so we do not
need to say “tagged partition” in this case.


A gauge δ on an interval I assigns an interval [t − δ(t), t + δ(t)] to each point t ∈ I. The
δ-fineness of a partition Ṗ requires that each subinterval I_i of Ṗ is contained in the interval
determined by the gauge δ and the tag t_i for that subinterval. This is indicated by the
inclusions in (1); see Figure 5.5.1. Note that the length of the subintervals is also controlled
by the gauge and the tags; the next lemma reflects that control.
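
Condition (1) is easy to test mechanically. The sketch below represents a tagged partition as a list of pairs ((lo, hi), tag) and checks the inclusions in (1) with exact rational arithmetic. The representation, the helper name is_delta_fine, the gauge, and the two sample partitions are all assumptions chosen for illustration.

```python
from fractions import Fraction as F

def is_delta_fine(tagged_partition, delta):
    """Check (1): tag t lies in [lo, hi], and [lo, hi] lies in [t - delta(t), t + delta(t)]."""
    return all(
        lo <= t <= hi and t - delta(t) <= lo and hi <= t + delta(t)
        for (lo, hi), t in tagged_partition
    )

def delta(t):
    # assumed gauge on [0, 1], chosen only for this illustration
    return F(1, 4) if t == 0 else t / 2

P1 = [((F(0), F(1, 4)), F(0)), ((F(1, 4), F(1, 2)), F(1, 2)), ((F(1, 2), F(1)), F(3, 4))]
P2 = [((F(0), F(1, 4)), F(0)), ((F(1, 4), F(1, 2)), F(1, 2)), ((F(1, 2), F(1)), F(3, 5))]
print(is_delta_fine(P1, delta))
print(is_delta_fine(P2, delta))
```

In P2 the last tag 3/5 has δ(3/5) = 3/10, so its gauge interval [3/10, 9/10] fails to cover [1/2, 1]; moving the tag to 3/4 repairs this.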


†This section can be omitted on a first reading of this chapter.


Figure 5.5.1 Inclusion (1)


5.5.3 Lemma If a partition Ṗ of I := [a, b] is δ-fine and x ∈ I, then there exists a tag t_i
in Ṗ such that |x − t_i| ≤ δ(t_i).


Proof. If x ∈ I, there exists a subinterval [x_{i−1}, x_i] from Ṗ that contains x. Since Ṗ is
δ-fine, then


(2)    t_i − δ(t_i) ≤ x_{i−1} ≤ x ≤ x_i ≤ t_i + δ(t_i),


whence it follows that |x − t_i| ≤ δ(t_i). Q.E.D.


In the theory of Riemann integration, we will use gauges δ that are constant
functions to control the fineness of the partition; in the theory of the generalized
Riemann integral, the use of nonconstant gauges is essential. But nonconstant gauge
functions arise quite naturally in connection with continuous functions. For, let f : I →
R be continuous on I and let ε > 0 be given. Then, for each point t ∈ I there
exists δ_ε(t) > 0 such that if |x − t| < δ_ε(t) and x ∈ I, then |f(x) − f(t)| < ε. Since δ_ε
is defined and is strictly positive on I, the function δ_ε is a gauge on I. Later in this
section, we will use the relations between gauges and continuity to give alternative
proofs of the fundamental properties of continuous functions discussed in Sections 5.3
and 5.4.


5.5.4 Examples (a) If δ and γ are gauges on I := [a, b] and if 0 < δ(x) ≤ γ(x) for all
x ∈ I, then every partition Ṗ that is δ-fine is also γ-fine. This follows immediately from the
inequalities


t_i − γ(t_i) ≤ t_i − δ(t_i) and t_i + δ(t_i) ≤ t_i + γ(t_i)

which imply that

t_i ∈ I_i ⊆ [t_i − δ(t_i), t_i + δ(t_i)] ⊆ [t_i − γ(t_i), t_i + γ(t_i)] for i = 1, ..., n.

(b) If δ_1 and δ_2 are gauges on I := [a, b] and if

δ(x) := min{δ_1(x), δ_2(x)} for all x ∈ I,


then δ is also a gauge on I. Moreover, since δ(x) ≤ δ_1(x), then every δ-fine partition is
δ_1-fine. Similarly, every δ-fine partition is also δ_2-fine.


(c) Suppose that δ is defined on I := [0, 1] by


δ(x) := 1/10    if x = 0,
        ½x      if 0 < x ≤ 1.


Then δ is a gauge on [0, 1]. If 0 < t ≤ 1, then [t − δ(t), t + δ(t)] = [½t, (3/2)t], which does not
contain the point 0. Thus, if Ṗ is a δ-fine partition of I, then the only subinterval in Ṗ that
contains 0 must have the point 0 as its tag.

(d) Let γ be defined on I := [0, 1] by


γ(x) := 1/10        if x = 0 or x = 1,
        ½x          if 0 < x ≤ ½,
        ½(1 − x)    if ½ < x < 1.


Then γ is a gauge on I, and it is an exercise to show that the subintervals in any γ-fine
partition that contain the points 0 or 1 must have these points as tags.


Existence of δ-Fine Partitions


In view of the above examples, it is not obvious that an arbitrary gauge δ admits a δ-fine
partition. We now use the Supremum Property of R to establish the existence of δ-fine
partitions. In the exercises, we will sketch a proof based on the Nested Intervals
Theorem 2.5.2.


5.5.5 Theorem If δ is a gauge defined on the interval [a, b], then there exists a δ-fine
partition of [a, b].


Proof. Let E denote the set of all points x ∈ [a, b] such that there exists a δ-fine partition
of the subinterval [a, x]. The set E is not empty, since the pair ([a, x], a) is a δ-fine
partition of the interval [a, x] when x ∈ [a, a + δ(a)] and x ≤ b. Since E ⊆ [a, b], the set
E is also bounded. Let u := sup E so that a < u ≤ b. We will show that u ∈ E and that
u = b.

We claim that u ∈ E. Since u − δ(u) < u = sup E, there exists v ∈ E such that
u − δ(u) < v < u. Let Ṗ_1 be a δ-fine partition of [a, v] and let Ṗ_2 := Ṗ_1 ∪ ([v, u], u).
Then Ṗ_2 is a δ-fine partition of [a, u], so that u ∈ E.

If u < b, let w ∈ [a, b] be such that u < w < u + δ(u). If Q̇_1 is a δ-fine partition of
[a, u], we let Q̇_2 := Q̇_1 ∪ ([u, w], u). Then Q̇_2 is a δ-fine partition of [a, w], whence
w ∈ E. But this contradicts the supposition that u is an upper bound of E. Therefore
u = b. Q.E.D.
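
The theorem is an existence statement, and its proof does not give an algorithm. As a complement, here is a heuristic Python sketch in the spirit of the bisection argument based on nested intervals: it keeps halving an interval until one of a few candidate tags (left endpoint, right endpoint, midpoint) satisfies condition (1). The function name, the restriction to three candidate tags, the depth limit, and the sample gauge are all assumptions; for some gauges this restricted search could fail even though a δ-fine partition exists.

```python
from fractions import Fraction as F

def fine_partition(delta, lo, hi, depth=0, max_depth=40):
    """Search for a delta-fine partition of [lo, hi] by repeated bisection."""
    mid = (lo + hi) / 2
    for t in (lo, hi, mid):  # candidate tags (a heuristic choice)
        if t - delta(t) <= lo and hi <= t + delta(t):
            return [((lo, hi), t)]
    if depth >= max_depth:
        raise RuntimeError("no tag found; try other candidate tags or more depth")
    left = fine_partition(delta, lo, mid, depth + 1, max_depth)
    right = fine_partition(delta, mid, hi, depth + 1, max_depth)
    return left + right

def delta(t):
    # assumed gauge on [0, 1], chosen only for this illustration
    return F(1, 10) if t == 0 else t / 2

P = fine_partition(delta, F(0), F(1))
for (lo, hi), t in P:
    print(f"[{lo}, {hi}] tagged at {t}")
```

With this gauge the only subinterval containing 0 must be tagged at 0 and have length at most 1/10, so the search has to bisect down to [0, 1/16] before it can stop on the left.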


Some Applications 


Following R. A. Gordon (see his Monthly article in the References), we will now show that 
some of the major theorems in the two preceding sections can be proved by using gauges. 


Alternate Proof of Theorem 5.3.2: Boundedness Theorem. Since f is continuous on I,
then for each t ∈ I there exists δ(t) > 0 such that if x ∈ I and |x − t| ≤ δ(t), then
|f(x) − f(t)| ≤ 1. Thus δ is a gauge on I. Let {(I_i, t_i)}_{i=1}^n be a δ-fine partition of I and
let K := max{|f(t_i)| : i = 1, ..., n}. By Lemma 5.5.3, given any x ∈ I there exists i with
|x − t_i| ≤ δ(t_i), whence


|f(x)| ≤ |f(x) − f(t_i)| + |f(t_i)| ≤ 1 + K.

Since x ∈ I is arbitrary, then f is bounded by 1 + K on I. Q.E.D.

Alternate Proof of Theorem 5.3.4: Maximum-Minimum Theorem. We will prove the
existence of x*. Let M := sup{f(x) : x ∈ I} and suppose that f(x) < M for all x ∈ I. Since f
is continuous on I, for each t ∈ I there exists δ(t) > 0 such that if x ∈ I and |x − t| ≤ δ(t),
then f(x) < ½(M + f(t)). Thus δ is a gauge on I, and if {(I_i, t_i)}_{i=1}^n is a δ-fine partition of I,
we let


M̃ := ½ max{M + f(t_1), ..., M + f(t_n)}.

By Lemma 5.5.3, given any x ∈ I, there exists i with |x − t_i| ≤ δ(t_i), whence

f(x) < ½(M + f(t_i)) ≤ M̃.


Since x ∈ I is arbitrary, then M̃ (< M) is an upper bound for f on I, contrary to the definition
of M as the supremum of f. Q.E.D.


Alternate Proof of Theorem 5.3.5: Location of Roots Theorem. We assume that f(t) ≠ 0
for all t ∈ I. Since f is continuous at t, Exercise 5.1.7 implies that there exists δ(t) > 0 such
that if x ∈ I and |x − t| ≤ δ(t), then f(x) < 0 if f(t) < 0, and f(x) > 0 if f(t) > 0. Then δ is
a gauge on I and we let {(I_i, t_i)}_{i=1}^n be a δ-fine partition. Note that for each i, either f(x) < 0
for all x ∈ [x_{i−1}, x_i] or f(x) > 0 for all such x. Since f(x_0) = f(a) < 0, this implies that
f(x_1) < 0, which in turn implies that f(x_2) < 0. Continuing in this way, we have
f(b) = f(x_n) < 0, contrary to the hypothesis that f(b) > 0. Q.E.D.


Alternate Proof of Theorem 5.4.3: Uniform Continuity Theorem. Let ε > 0 be given.
Since f is continuous at t ∈ I, there exists δ(t) > 0 such that if x ∈ I and |x − t| ≤ 2δ(t),
then |f(x) − f(t)| ≤ ½ε. Thus δ is a gauge on I. If {(I_i, t_i)}_{i=1}^n is a δ-fine partition of I,
let δ_ε := min{δ(t_1), ..., δ(t_n)}. Now suppose that x, u ∈ I and |x − u| ≤ δ_ε, and choose i
with |x − t_i| ≤ δ(t_i). Since


|u − t_i| ≤ |u − x| + |x − t_i| ≤ δ_ε + δ(t_i) ≤ 2δ(t_i),

then it follows that


|f(x) − f(u)| ≤ |f(x) − f(t_i)| + |f(t_i) − f(u)| ≤ ½ε + ½ε = ε.


Therefore, f is uniformly continuous on I. Q.E.D.


Exercises for Section 5.5 


1. Let δ be the gauge on [0, 1] defined by δ(0) := ¼ and δ(t) := ½t for t ∈ (0, 1].
(a) Show that Ṗ_1 = {([0, ¼], 0), ([¼, ½], ½), ([½, 1], ¾)} is δ-fine.
(b) Show that Ṗ_2 = {([0, ¼], 0), ([¼, ½], ½), ([½, 1], 3/5)} is not δ-fine.

2. Suppose that δ_1 is the gauge defined by δ_1(0) := ¼, δ_1(t) := ¾t for t ∈ (0, 1]. Are the partitions
given in Exercise 1 δ_1-fine? Note that δ(t) ≤ δ_1(t) for all t ∈ [0, 1].

3. Suppose that δ_2 is the gauge defined by δ_2(0) := 1/10 and δ_2(t) := (9/10)t for t ∈ (0, 1]. Are the
partitions given in Exercise 1 δ_2-fine?


4. Let γ be the gauge in Example 5.5.4(d).
(a) If t ∈ (0, ½] show that [t − γ(t), t + γ(t)] = [½t, (3/2)t] ⊆ (0, ¾].
(b) If t ∈ (½, 1) show that [t − γ(t), t + γ(t)] ⊆ (¼, 1).

5. Let a < c < b and let δ be a gauge on [a, b]. If Ṗ′ is a δ-fine partition of [a, c] and if Ṗ″ is a δ-fine
partition of [c, b], show that Ṗ′ ∪ Ṗ″ is a δ-fine partition of [a, b] having c as a partition point.

6. Let a < c < b and let δ′ and δ″ be gauges on [a, c] and [c, b], respectively. If δ is defined on
[a, b] by


δ(t) := δ′(t)                 if t ∈ [a, c),
        min{δ′(c), δ″(c)}     if t = c,
        δ″(t)                 if t ∈ (c, b],


then δ is a gauge on [a, b]. Moreover, if Ṗ′ is a δ′-fine partition of [a, c] and Ṗ″ is a δ″-fine
partition of [c, b], then Ṗ′ ∪ Ṗ″ is a tagged partition of [a, b] having c as a partition point.
Explain why Ṗ′ ∪ Ṗ″ may not be δ-fine. Give an example.


7. Let δ′ and δ″ be as in the preceding exercise and let δ* be defined by


δ*(t) := min{δ′(t), ½(c − t)}    if t ∈ [a, c),
         min{δ′(c), δ″(c)}       if t = c,
         min{δ″(t), ½(t − c)}    if t ∈ (c, b].


Show that δ* is a gauge on [a, b] and that every δ*-fine partition Ṗ of [a, b] having c as a partition
point gives rise to a δ′-fine partition Ṗ′ of [a, c] and a δ″-fine partition Ṗ″ of [c, b] such that
Ṗ = Ṗ′ ∪ Ṗ″.

8. Let δ be a gauge on I := [a, b] and suppose that I does not have a δ-fine partition.
(a) Let c := ½(a + b). Show that at least one of the intervals [a, c] and [c, b] does not have a
δ-fine partition.
(b) Construct a nested sequence (I_n) of subintervals with the length of I_n equal to (b − a)/2^n
such that I_n does not have a δ-fine partition.
(c) Let ξ ∈ ∩_{n=1}^∞ I_n and let p ∈ N be such that (b − a)/2^p < δ(ξ). Show that I_p ⊆ [ξ − δ(ξ),
ξ + δ(ξ)], so the pair (I_p, ξ) is a δ-fine partition of I_p.

9. Let I := [a, b] and let f : I → R be a (not necessarily continuous) function. We say that f is
“locally bounded” at c ∈ I if there exists δ(c) > 0 such that f is bounded on I ∩ [c − δ(c),
c + δ(c)]. Prove that if f is locally bounded at every point of I, then f is bounded on I.


10. Let I := [a, b] and f : I → R. We say that f is “locally increasing” at c ∈ I if there exists δ(c) >
0 such that f is increasing on I ∩ [c − δ(c), c + δ(c)]. Prove that if f is locally increasing at every
point of I, then f is increasing on I.


Section 5.6 Monotone and Inverse Functions 


Recall that if A ⊆ R, then a function f : A → R is said to be increasing on A if whenever
x_1, x_2 ∈ A and x_1 ≤ x_2, then f(x_1) ≤ f(x_2). The function f is said to be strictly increasing
on A if whenever x_1, x_2 ∈ A and x_1 < x_2, then f(x_1) < f(x_2). Similarly, g : A → R is said
to be decreasing on A if whenever x_1, x_2 ∈ A and x_1 ≤ x_2 then g(x_1) ≥ g(x_2). The
function g is said to be strictly decreasing on A if whenever x_1, x_2 ∈ A and x_1 < x_2 then
g(x_1) > g(x_2).

If a function is either increasing or decreasing on A, we say that it is monotone on A.
If f is either strictly increasing or strictly decreasing on A, we say that f is strictly
monotone on A.

We note that if f : A → R is increasing on A then g := −f is decreasing on A; similarly
if φ : A → R is decreasing on A then ψ := −φ is increasing on A.

In this section, we will be concerned with monotone functions that are defined on
an interval I ⊆ R. We will discuss increasing functions explicitly, but it is clear that
there are corresponding results for decreasing functions. These results can either be 
obtained directly from the results for increasing functions or proved by similar 
arguments. 

Monotone functions are not necessarily continuous. For example, if f(x) := 0 for 
x € [0,1] and f(x) := 1 for x € (1, 2], then f is increasing on [0, 2], but fails to be 
continuous at x = 1. However, the next result shows that a monotone function always has 
both one-sided limits (see Definition 4.3.1) in R at every point that is not an endpoint of its 
domain. 


5.6.1 Theorem Let I ⊆ R be an interval and let f : I → R be increasing on I. Suppose
that c ∈ I is not an endpoint of I. Then


(i) lim_{x→c−} f = sup{f(x) : x ∈ I, x < c},
(ii) lim_{x→c+} f = inf{f(x) : x ∈ I, x > c}.


Proof. (i) First note that if x ∈ I and x < c, then f(x) ≤ f(c). Hence the set
{f(x) : x ∈ I, x < c}, which is nonvoid since c is not an endpoint of I, is bounded above
by f(c). Thus the indicated supremum exists; we denote it by L. If ε > 0 is given, then L − ε
is not an upper bound of this set. Hence there exists y_ε ∈ I, y_ε < c such that
L − ε < f(y_ε) ≤ L.

Since f is increasing, we deduce that if δ_ε := c − y_ε and if 0 < c − y < δ_ε, then y_ε <
y < c so that


L − ε < f(y_ε) ≤ f(y) ≤ L.


Therefore |f(y) − L| < ε when 0 < c − y < δ_ε. Since ε > 0 is arbitrary we infer that (i)
holds.
The proof of (ii) is similar. Q.E.D.


The next result gives criteria for the continuity of an increasing function f at a point c 
that is not an endpoint of the interval on which f is defined. 


5.6.2 Corollary Let I ⊆ R be an interval and let f : I → R be increasing on I. Suppose
that c ∈ I is not an endpoint of I. Then the following statements are equivalent.


(a) f is continuous at c.
(b) lim_{x→c−} f = f(c) = lim_{x→c+} f.
(c) sup{f(x) : x ∈ I, x < c} = f(c) = inf{f(x) : x ∈ I, x > c}.


This follows easily from Theorems 4.3.3 and 5.6.1. We leave the details to the 
reader. 

Let I be an interval and let f : I → R be an increasing function. If a is the left endpoint
of I, it is an exercise to show that f is continuous at a if and only if


f(a) = inf{f(x) : x ∈ I, a < x}


or if and only if f(a) = lim_{x→a+} f. Similar conditions apply at a right endpoint, and for
decreasing functions.


Figure 5.6.1 The jump of f at c


If f : I → R is increasing on I and if c is not an endpoint of I, we define the jump of f
at c to be j_f(c) := lim_{x→c+} f − lim_{x→c−} f. (See Figure 5.6.1.) It follows from Theorem 5.6.1
that

j_f(c) = inf{f(x) : x ∈ I, x > c} − sup{f(x) : x ∈ I, x < c}

for an increasing function. If the left endpoint a of I belongs to I, we define the jump of f at
a to be j_f(a) := lim_{x→a+} f − f(a). If the right endpoint b of I belongs to I, we define the jump
of f at b to be j_f(b) := f(b) − lim_{x→b−} f.


5.6.3 Theorem Let I ⊆ R be an interval and let f : I → R be increasing on I. If c ∈ I,
then f is continuous at c if and only if j_f(c) = 0.


Proof. If c is not an endpoint, this follows immediately from Corollary 5.6.2. If c ∈ I is
the left endpoint of I, then f is continuous at c if and only if f(c) = lim_{x→c+} f, which is
equivalent to j_f(c) = 0. Similar remarks apply to the case of a right endpoint. Q.E.D.


We now show that there can be at most a countable set of points at which a monotone 
function is discontinuous. 


5.6.4 Theorem Let I ⊆ R be an interval and let f : I → R be monotone on I. Then the
set of points D ⊆ I at which f is discontinuous is a countable set.


Proof. We shall suppose that f is increasing on I. It follows from Theorem 5.6.3 that
D = {x ∈ I : j_f(x) ≠ 0}. We shall consider the case that I := [a, b] is a closed bounded
interval, leaving the case of an arbitrary interval to the reader.

We first note that since f is increasing, then j_f(c) ≥ 0 for all c ∈ I. Moreover, if
a ≤ x_1 < ··· < x_n ≤ b, then (why?) we have


(1)    f(a) ≤ f(a) + j_f(x_1) + ··· + j_f(x_n) ≤ f(b),

whence it follows that

j_f(x_1) + ··· + j_f(x_n) ≤ f(b) − f(a).


(See Figure 5.6.2.) Consequently there can be at most k points in I = [a, b] where
j_f(x) ≥ (f(b) − f(a))/k. We conclude that there is at most one point x ∈ I where
j_f(x) = f(b) − f(a); there are at most two points in I where j_f(x) ≥ (f(b) − f(a))/2;
at most three points in I where j_f(x) ≥ (f(b) − f(a))/3, and so on. Therefore there is at
most a countable set of points x where j_f(x) > 0. But since every point in D must be
included in this set, we deduce that D is a countable set. Q.E.D.


Figure 5.6.2 j_f(x_1) + ··· + j_f(x_n) ≤ f(b) − f(a)
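
The counting step of this proof can be illustrated with a small Python sketch. The list of jump sizes below is assumed data for some increasing function with f(b) − f(a) = 1, not taken from the text; the point is only that, because the jumps sum to at most f(b) − f(a), no more than k of them can be as large as (f(b) − f(a))/k.

```python
def count_large_jumps(jumps, total, k):
    """Number of jumps j with j >= total / k."""
    return sum(1 for j in jumps if j >= total / k)

# Assumed data: jump sizes of an increasing f on [a, b] with f(b) - f(a) = 1.
jumps = [0.4, 0.25, 0.125, 0.0625, 0.03125]
total = 1.0
assert sum(jumps) <= total  # the inequality following (1)

for k in range(1, 6):
    print(k, count_large_jumps(jumps, total, k))
```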


Theorem 5.6.4 has some useful applications. For example, it was seen in Exercise
5.2.12 that if h : R → R satisfies the identity

(2)    h(x + y) = h(x) + h(y) for all x, y ∈ R,


and if h is continuous at a single point x_0, then h is continuous at every point of R. Thus, if h
is a monotone function satisfying (2), then h must be continuous on R. [It follows from this
that h(x) = Cx for all x ∈ R, where C := h(1).]


Inverse Functions 


We shall now consider the existence of inverses for functions that are continuous on an
interval I ⊆ R. We recall (see Section 1.1) that a function f : I → R has an inverse function
if and only if f is injective (= one-one); that is, x, y ∈ I and x ≠ y imply that f(x) ≠ f(y).
We note that a strictly monotone function is injective and so has an inverse. In the next
theorem, we show that if f : I → R is a strictly monotone continuous function, then f
has an inverse function g on J := f(I) that is strictly monotone and continuous on J. In
particular, if f is strictly increasing then so is g, and if f is strictly decreasing then so is g.


5.6.5 Continuous Inverse Theorem Let I ⊆ R be an interval and let f : I → R be
strictly monotone and continuous on I. Then the function g inverse to f is strictly monotone
and continuous on J := f(I).


Proof. We consider the case that f is strictly increasing, leaving the case that f is strictly
decreasing to the reader.

Since f is continuous and I is an interval, it follows from the Preservation of Intervals
Theorem 5.3.10 that J := f(I) is an interval. Moreover, since f is strictly increasing on I, it
is injective on I; therefore the function g : J → R inverse to f exists. We claim that g is
strictly increasing. Indeed, if y_1, y_2 ∈ J with y_1 < y_2, then y_1 = f(x_1) and y_2 = f(x_2) for
some x_1, x_2 ∈ I. We must have x_1 < x_2; otherwise x_1 ≥ x_2, which implies that
y_1 = f(x_1) ≥ f(x_2) = y_2, contrary to the hypothesis that y_1 < y_2. Therefore we have
g(y_1) = x_1 < x_2 = g(y_2). Since y_1 and y_2 are arbitrary elements of J with y_1 < y_2, we
conclude that g is strictly increasing on J.

It remains to show that g is continuous on J. However, this is a consequence of the fact
that g(J) = I is an interval. Indeed, if g is discontinuous at a point c ∈ J, then the jump of g
at c is nonzero so that lim_{y→c−} g < lim_{y→c+} g. If we choose any number x ≠ g(c) satisfying
lim_{y→c−} g < x < lim_{y→c+} g, then x has the property that x ≠ g(y) for any y ∈ J. (See
Figure 5.6.3.) Hence x ∉ I, which contradicts the fact that I is an interval. Therefore
we conclude that g is continuous on J. Q.E.D.


Figure 5.6.3 g(y) ≠ x for y ∈ J
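
The theorem guarantees that the inverse g exists and is continuous, but gives no formula for it. One common way to evaluate such an inverse numerically is bisection, which uses only the strict monotonicity and continuity of f. The sketch below is an illustration under assumptions: the function name, the tolerance, and the example f(x) = x³ + x on [0, 2] are not from the text.

```python
def inverse_by_bisection(f, lo, hi, y, tol=1e-12):
    """Approximate g(y), where g is inverse to a strictly increasing continuous f on [lo, hi]."""
    if not f(lo) <= y <= f(hi):
        raise ValueError("y is not in f([lo, hi])")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if mid <= lo or mid >= hi:  # floating-point resolution reached
            break
        if f(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def f(x):
    # assumed example: strictly increasing and continuous on [0, 2]
    return x**3 + x

def g(y):
    return inverse_by_bisection(f, 0.0, 2.0, y)

print(g(2.0))                      # close to 1, since f(1) = 2
print(g(1.0) < g(2.0) < g(10.0))   # g is increasing, as the theorem asserts
```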


The nth Root Function 


We will apply the Continuous Inverse Theorem 5.6.5 to the nth power function. We need to 
distinguish two cases: (i) n even, and (ii) n odd. 


(i) n even. In order to obtain a function that is strictly monotone, we restrict our
attention to the interval I := [0, ∞). Thus, let f(x) := x^n for x ∈ I. (See Figure 5.6.4.) We
have seen (in Exercise 2.1.23) that if 0 ≤ x < y, then f(x) = x^n < y^n = f(y); therefore f is
strictly increasing on I. Moreover, it follows from Example 5.2.3(a) that f is continuous on
I. Therefore, by the Preservation of Intervals Theorem 5.3.10, J := f(I) is an interval.
We will show that J = [0, ∞). Let y ≥ 0 be arbitrary; by the Archimedean Property, there
exists k ∈ N such that 0 ≤ y < k. Since


f(0) = 0 ≤ y < k ≤ k^n = f(k),


it follows from Bolzano’s Intermediate Value Theorem 5.3.7 that y ∈ J. Since y ≥ 0 is
arbitrary, we deduce that J = [0, ∞).

We conclude from the Continuous Inverse Theorem 5.6.5 that the function $g$ that is
inverse to $f(x) = x^n$ on $I = [0, \infty)$ is strictly increasing and continuous on $J = [0, \infty)$. We
usually write

$$g(x) = x^{1/n} \quad \text{or} \quad g(x) = \sqrt[n]{x}$$

for $x \ge 0$ ($n$ even), and call $x^{1/n} = \sqrt[n]{x}$ the $n$th root of $x \ge 0$ ($n$ even). The function $g$ is
called the $n$th root function ($n$ even). (See Figure 5.6.5.)

Figure 5.6.4 Graph of $f(x) = x^n$ ($x \ge 0$, $n$ even)
Figure 5.6.5 Graph of $g(x) = x^{1/n}$ ($x \ge 0$, $n$ even)

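Since $g$ is inverse to $f$ we have

$$g(f(x)) = x \quad \text{and} \quad f(g(x)) = x \quad \text{for all } x \in [0, \infty).$$

We can write these equations in the following form:

$$(x^n)^{1/n} = x \quad \text{and} \quad (x^{1/n})^n = x$$

for all $x \in [0, \infty)$ and $n$ even.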
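The existence argument for the even case can be mirrored numerically: the Archimedean Property supplies a bracket $[0, k]$ with $0 \le y < k \le k^n$, and the Intermediate Value Theorem guarantees a root of $x^n = y$ inside it. The following is a minimal sketch of that idea using bisection; the function name `nth_root` and the fixed step count are illustrative assumptions, not part of the text.

```python
def nth_root(y, n, steps=200):
    """Approximate the nonnegative nth root of y >= 0 by bisection.

    Hypothetical helper: it follows the bracket 0 <= y < k <= k**n used
    in the text, then repeatedly halves the bracket.
    """
    if y < 0 or n < 1:
        raise ValueError("need y >= 0 and n >= 1")
    # Archimedean step: an integer k with y < k, so f(0) <= y < f(k).
    k = int(y) + 1
    lo, hi = 0.0, float(k)
    for _ in range(steps):
        mid = (lo + hi) / 2.0
        # f(x) = x**n is strictly increasing on [0, oo), so the comparison
        # tells us on which side of mid the root lies.
        if mid ** n < y:
            lo = mid
        else:
            hi = mid
    return hi


fourth_root_of_16 = nth_root(16.0, 4)
square_root_of_2 = nth_root(2.0, 2)
```

Bisection is used here only because it matches the Intermediate Value Theorem argument; it relies on nothing beyond continuity and strict monotonicity of $x \mapsto x^n$ on $[0, \infty)$.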

(ii) $n$ odd. In this case we let $F(x) := x^n$ for all $x \in \mathbb{R}$; by 5.2.3(a), $F$ is continuous
on $\mathbb{R}$. We leave it to the reader to show that $F$ is strictly increasing on $\mathbb{R}$ and that $F(\mathbb{R}) = \mathbb{R}$.
(See Figure 5.6.6.)

It follows from the Continuous Inverse Theorem 5.6.5 that the function $G$ that is
inverse to $F(x) = x^n$ for $x \in \mathbb{R}$, is strictly increasing and continuous on $\mathbb{R}$. We usually
write

$$G(x) = x^{1/n} \quad \text{or} \quad G(x) = \sqrt[n]{x} \quad \text{for } x \in \mathbb{R},\ n \text{ odd},$$

and call $x^{1/n}$ the $n$th root of $x \in \mathbb{R}$. The function $G$ is called the $n$th root function ($n$ odd).
(See Figure 5.6.7.) Here we have

$$(x^n)^{1/n} = x \quad \text{and} \quad (x^{1/n})^n = x$$

for all $x \in \mathbb{R}$ and $n$ odd.

Figure 5.6.6 Graph of $F(x) = x^n$ ($x \in \mathbb{R}$, $n$ odd)
Figure 5.6.7 Graph of $G(x) = x^{1/n}$ ($x \in \mathbb{R}$, $n$ odd)


Rational Powers


Now that the $n$th root functions have been defined for $n \in \mathbb{N}$, it is easy to define rational
powers.


5.6.6 Definition (i) If $m, n \in \mathbb{N}$ and $x \ge 0$, we define $x^{m/n} := (x^{1/n})^m$.
(ii) If $m, n \in \mathbb{N}$ and $x > 0$, we define $x^{-m/n} := (x^{1/n})^{-m}$.


Hence we have defined $x^r$ when $r$ is a rational number and $x > 0$. The graphs of $x \mapsto x^r$
depend on whether $r > 1$, $r = 1$, $0 < r < 1$, $r = 0$, or $r < 0$. (See Figure 5.6.8.) Since a
rational number $r \in \mathbb{Q}$ can be written in the form $r = m/n$ with $m \in \mathbb{Z}$, $n \in \mathbb{N}$, in many
ways, it should be shown that Definition 5.6.6 is not ambiguous. That is, if $r = m/n = p/q$
with $m, p \in \mathbb{Z}$ and $n, q \in \mathbb{N}$ and if $x > 0$, then $(x^{1/n})^m = (x^{1/q})^p$. We leave it as an exercise
to the reader to establish this relation.


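5.6.7 Theorem If $m \in \mathbb{Z}$, $n \in \mathbb{N}$, and $x > 0$, then $x^{m/n} = (x^m)^{1/n}$.


Proof. If $x > 0$ and $m, n \in \mathbb{Z}$, then $(x^m)^n = x^{mn} = (x^n)^m$. Now let $y := x^{m/n} = (x^{1/n})^m > 0$
so that $y^n = ((x^{1/n})^m)^n = ((x^{1/n})^n)^m = x^m$. Therefore it follows that $y = (x^m)^{1/n}$. Q.E.D.


The reader should also show, as an exercise, that if $x > 0$ and $r, s \in \mathbb{Q}$, then

$$x^r x^s = x^{r+s} = x^s x^r \quad \text{and} \quad (x^r)^s = x^{rs} = (x^s)^r.$$

Figure 5.6.8 Graphs of $x \mapsto x^r$ ($x \ge 0$)


Exercises for Section 5.6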


1. If $I := [a, b]$ is an interval and $f : I \to \mathbb{R}$ is an increasing function, then the point $a$ [respectively,
$b$] is an absolute minimum [respectively, maximum] point for $f$ on $I$. If $f$ is strictly increasing,
then $a$ is the only absolute minimum point for $f$ on $I$.

2. If $f$ and $g$ are increasing functions on an interval $I \subseteq \mathbb{R}$, show that $f + g$ is an increasing function
on $I$. If $f$ is also strictly increasing on $I$, then $f + g$ is strictly increasing on $I$.

3. Show that both $f(x) := x$ and $g(x) := x - 1$ are strictly increasing on $I := [0, 1]$, but that their
product $fg$ is not increasing on $I$.

4. Show that if $f$ and $g$ are positive increasing functions on an interval $I$, then their product $fg$ is
increasing on $I$.

5. Show that if $I := [a, b]$ and $f : I \to \mathbb{R}$ is increasing on $I$, then $f$ is continuous at $a$ if and only if
$f(a) = \inf\{ f(x) : x \in (a, b] \}$.

6. Let $I \subseteq \mathbb{R}$ be an interval and let $f : I \to \mathbb{R}$ be increasing on $I$. Suppose that $c \in I$ is not an
endpoint of $I$. Show that $f$ is continuous at $c$ if and only if there exists a sequence $(x_n)$ in $I$ such
that $x_n < c$ for $n = 1, 3, 5, \ldots$; $x_n > c$ for $n = 2, 4, 6, \ldots$; and such that $c = \lim(x_n)$ and
$f(c) = \lim(f(x_n))$.

7. Let $I \subseteq \mathbb{R}$ be an interval and let $f : I \to \mathbb{R}$ be increasing on $I$. If $c$ is not an endpoint of $I$,
show that the jump $j_f(c)$ of $f$ at $c$ is given by $\inf\{ f(y) - f(x) : x < c < y,\ x, y \in I \}$.

8. Let $f, g$ be strictly increasing on an interval $I \subseteq \mathbb{R}$ and let $f(x) > g(x)$ for all $x \in I$. If
$y \in f(I) \cap g(I)$, show that $f^{-1}(y) < g^{-1}(y)$. [Hint: First interpret this statement geometrically.]

9. Let $I := [0, 1]$ and let $f : I \to \mathbb{R}$ be defined by $f(x) := x$ for $x$ rational, and $f(x) := 1 - x$ for $x$
irrational. Show that $f$ is injective on $I$ and that $f(f(x)) = x$ for all $x \in I$. (Hence $f$ is its own
inverse function!) Show that $f$ is continuous only at the point $x = \tfrac{1}{2}$.

10. Let $I := [a, b]$ and let $f : I \to \mathbb{R}$ be continuous on $I$. If $f$ has an absolute maximum [respectively,
minimum] at an interior point $c$ of $I$, show that $f$ is not injective on $I$.

11. Let $f(x) := x$ for $x \in [0, 1]$, and $f(x) := 1 + x$ for $x \in (1, 2]$. Show that $f$ and $f^{-1}$ are strictly
increasing. Are $f$ and $f^{-1}$ continuous at every point?

12. Let $f : [0, 1] \to \mathbb{R}$ be a continuous function that does not take on any of its values twice and with
$f(0) < f(1)$. Show that $f$ is strictly increasing on $[0, 1]$.

13. Let $h : [0, 1] \to \mathbb{R}$ be a function that takes on each of its values exactly twice. Show that $h$
cannot be continuous at every point. [Hint: If $c_1 < c_2$ are the points where $h$ attains its
supremum, show that $c_1 = 0$, $c_2 = 1$. Now examine the points where $h$ attains its infimum.]

14. Let $x \in \mathbb{R}$, $x > 0$. Show that if $m, p \in \mathbb{Z}$, $n, q \in \mathbb{N}$, and $mq = np$, then $(x^{1/n})^m = (x^{1/q})^p$.

15. If $x \in \mathbb{R}$, $x > 0$, and if $r, s \in \mathbb{Q}$, show that $x^r x^s = x^{r+s} = x^s x^r$ and $(x^r)^s = x^{rs} = (x^s)^r$.


CHAPTER 6 


DIFFERENTIATION 


Prior to the seventeenth century, a curve was generally described as a locus of points 
satisfying some geometric condition, and tangent lines were obtained through geometric 
construction. This viewpoint changed dramatically with the creation of analytic geometry in 
the 1630s by René Descartes (1596-1650) and Pierre de Fermat (1601-1665). In this new 
setting geometric problems were recast in terms of algebraic expressions, and new classes of 
curves were defined by algebraic rather than geometric conditions. The concept of derivative 
evolved in this new context. The problem of finding tangent lines and the seemingly unrelated 
problem of finding maximum or minimum values were first seen to have a connection by 
Fermat in the 1630s. And the relation between tangent lines to curves and the velocity of a 
moving particle was discovered in the late 1660s by Isaac Newton. Newton’s theory of 
“fluxions,” which was based on an intuitive idea of limit, would be familiar to any modern 
student of differential calculus once some changes in terminology and notation were made. 
But the vital observation, made by Newton and, independently, by Gottfried Leibniz in the 
1680s, was that areas under curves could be calculated by reversing the differentiation 
process. This exciting technique, one that solved previously difficult area problems with ease, 
sparked enormous interest among the mathematicians of the era and led to a coherent theory 
that became known as the differential and integral calculus. 


Isaac Newton 

Isaac Newton (1642-1727) was born in Woolsthorpe, in Lincolnshire, 
England, on Christmas Day; his father, a farmer, had died three months 
earlier. His mother remarried when he was three years old and he was sent 
to live with his grandmother. He returned to his mother at age eleven, only 
to be sent to boarding school in Grantham the next year. Fortunately, a 
perceptive teacher noticed his mathematical talent and, in 1661, Newton 
entered Trinity College at Cambridge University, where he studied with 
Isaac Barrow. 

When the bubonic plague struck in 1665-1666, leaving dead nearly 
70,000 persons in London, the university closed and Newton spent two 
years back in Woolsthorpe. It was during this period that he formulated his 
basic ideas concerning optics, gravitation, and his method of “fluxions,” later called “calculus.”
He returned to Cambridge in 1667 and was appointed Lucasian Professor in 1669. His theories of
universal gravitation and planetary motion were published to world acclaim in 1687 under the title
Philosophiae Naturalis Principia Mathematica. However, he neglected to publish his method of
inverse tangents for finding areas and other work in calculus, and this led to a controversy over 
priority with Leibniz. 

Following an illness, he retired from Cambridge University and in 1696 was appointed 
Warden of the British mint. However, he maintained contact with advances in science and 
mathematics and served as President of the Royal Society from 1703 until his death in 1727. At 
his funeral, Newton was eulogized as “the greatest genius that ever existed.” His place of burial in 
Westminster Abbey is a popular tourist site.

In this chapter we will develop the theory of differentiation. Integration theory,
including the fundamental theorem that relates differentiation and integration, will be the 
subject of the next chapter. We will assume that the reader is already familiar with the 
geometrical and physical interpretations of the derivative of a function as described in 
introductory calculus courses. Consequently, we will concentrate on the mathematical 
aspects of the derivative and not go into its applications in geometry, physics, economics, 
and so on. 

The first section is devoted to a presentation of the basic results concerning the 
differentiation of functions. In Section 6.2 we discuss the fundamental Mean Value 
Theorem and some of its applications. In Section 6.3 the important L'Hospital Rules
are presented for the calculation of certain types of “indeterminate” limits.

In Section 6.4 we give a brief discussion of Taylor’s Theorem and a few of its 
applications—for example, to convex functions and to Newton’s Method for the location 
of roots. 


Section 6.1 The Derivative 


In this section we will present some of the elementary properties of the derivative. We 
begin with the definition of the derivative of a function. 


6.1.1 Definition Let $I \subseteq \mathbb{R}$ be an interval, let $f : I \to \mathbb{R}$, and let $c \in I$. We say that a real
number $L$ is the derivative of $f$ at $c$ if given any $\varepsilon > 0$ there exists $\delta(\varepsilon) > 0$ such that if $x \in I$
satisfies $0 < |x - c| < \delta(\varepsilon)$, then

$$\left| \frac{f(x) - f(c)}{x - c} - L \right| < \varepsilon. \tag{1}$$

In this case we say that $f$ is differentiable at $c$, and we write $f'(c)$ for $L$.
In other words, the derivative of $f$ at $c$ is given by the limit

$$f'(c) = \lim_{x \to c} \frac{f(x) - f(c)}{x - c} \tag{2}$$

provided this limit exists. (We allow the possibility that $c$ may be the endpoint of the
interval.)


Note It is possible to define the derivative of a function having a domain more general
than an interval (since the point c need only be an element of the domain and also a cluster 
point of the domain) but the significance of the concept is most naturally apparent for 
functions defined on intervals. Consequently we shall limit our attention to such functions. 
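The limit in Definition 6.1.1 can be explored numerically by evaluating the difference quotient at points approaching $c$. The sketch below does this for the sample function $f(x) := x^2$ at $c = 3$, where the quotient equals $x + c$ and so tends to $2c = 6$; the helper name `difference_quotient`, the sample function, and the chosen offsets are illustrative assumptions.

```python
def difference_quotient(f, c, x):
    """Return (f(x) - f(c)) / (x - c), the quotient in Definition 6.1.1."""
    if x == c:
        raise ValueError("the difference quotient is undefined at x = c")
    return (f(x) - f(c)) / (x - c)


def square(x):
    # Sample function f(x) := x^2, chosen only for illustration.
    return x * x


c = 3.0
offsets = [10.0 ** (-k) for k in range(1, 7)]  # 0.1, 0.01, ..., 1e-6
quotients = [difference_quotient(square, c, c + h) for h in offsets]
# Distance of each quotient from the expected limit 2c.
errors = [abs(q - 2.0 * c) for q in quotients]
```

A table of such quotients only suggests the value of the limit; establishing it requires the $\varepsilon$-$\delta$ argument of the definition.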


Whenever the derivative of $f : I \to \mathbb{R}$ exists at a point $c \in I$, its value is denoted by
$f'(c)$. In this way we obtain a function $f'$ whose domain is a subset of the domain of $f$. In
working with the function $f'$, it is convenient to regard it also as a function of $x$. For
example, if $f(x) := x^2$ for $x \in \mathbb{R}$, then at any $c$ in $\mathbb{R}$ we have

$$f'(c) = \lim_{x \to c} \frac{f(x) - f(c)}{x - c} = \lim_{x \to c} \frac{x^2 - c^2}{x - c} = \lim_{x \to c} (x + c) = 2c.$$

Thus, in this case, the function $f'$ is defined on all of $\mathbb{R}$ and $f'(x) = 2x$ for $x \in \mathbb{R}$.

We now show that continuity of $f$ at a point $c$ is a necessary (but not sufficient)
condition for the existence of the derivative at $c$.


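6.1.2 Theorem If $f : I \to \mathbb{R}$ has a derivative at $c \in I$, then $f$ is continuous at $c$.


Proof. For all $x \in I$, $x \ne c$, we have

$$f(x) - f(c) = \left( \frac{f(x) - f(c)}{x - c} \right) (x - c).$$

Since $f'(c)$ exists, we may apply Theorem 4.2.4 concerning the limit of a product to
conclude that

$$\lim_{x \to c} (f(x) - f(c)) = \lim_{x \to c} \left( \frac{f(x) - f(c)}{x - c} \right) \cdot \lim_{x \to c} (x - c) = f'(c) \cdot 0 = 0.$$

Therefore, $\lim_{x \to c} f(x) = f(c)$ so that $f$ is continuous at $c$. Q.E.D.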


The continuity of $f : I \to \mathbb{R}$ at a point does not assure the existence of the derivative at
that point. For example, if $f(x) := |x|$ for $x \in \mathbb{R}$, then for $x \ne 0$ we have
$(f(x) - f(0))/(x - 0) = |x|/x$, which is equal to $1$ if $x > 0$, and equal to $-1$ if $x < 0$.
Thus the limit at $0$ does not exist [see Example 4.1.10(b)], and therefore the function is not
differentiable at $0$. Hence, continuity at a point $c$ is not a sufficient condition for the
derivative to exist at $c$.
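The failure of differentiability of $|x|$ at $0$ is visible in the one-sided difference quotients. The following minimal sketch evaluates $|x|/x$ at points approaching $0$ from the right and from the left; the helper name `quotient_at_zero` and the sample points are illustrative assumptions.

```python
def quotient_at_zero(x):
    """Difference quotient (|x| - |0|) / (x - 0) of the absolute value at 0."""
    if x == 0:
        raise ValueError("the quotient is undefined at x = 0")
    return (abs(x) - abs(0.0)) / (x - 0.0)


# Points approaching 0 from the right and from the left.
right_values = [quotient_at_zero(10.0 ** (-k)) for k in range(1, 8)]
left_values = [quotient_at_zero(-(10.0 ** (-k))) for k in range(1, 8)]
```

The quotients from the right are all $1$ and those from the left are all $-1$, so no single number $L$ can satisfy Definition 6.1.1 at $c = 0$.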


Remark By taking simple algebraic combinations of functions of the form $x \mapsto |x - c|$,
it is not difficult to construct continuous functions that do not have a derivative at a finite (or
even a countable) number of points. In 1872, Karl Weierstrass astounded the mathematical
world by giving an example of a function that is continuous at every point but whose
derivative does not exist anywhere. Such a function defied geometric intuition about
curves and tangent lines, and consequently spurred much deeper investigations into the
concepts of real analysis. It can be shown that the function $f$ defined by the series

$$f(x) := \sum_{n=0}^{\infty} \frac{1}{2^n} \cos(3^n x)$$

has the stated property. A very interesting historical discussion of this and other examples
of continuous, nondifferentiable functions is given in Kline, pp. 955-966, and also in
Hawkins, pp. 44-46. A detailed proof for a slightly different example can be found in
Appendix E.


There are a number of basic properties of the derivative that are very useful in the 
calculation of the derivatives of various combinations of functions. We now provide the 
justification of some of these properties, which will be familiar to the reader from earlier 
courses. 


6.1.3 Theorem Let $I \subseteq \mathbb{R}$ be an interval, let $c \in I$, and let $f : I \to \mathbb{R}$ and $g : I \to \mathbb{R}$ be
functions that are differentiable at $c$. Then:
(a) If $\alpha \in \mathbb{R}$, then the function $\alpha f$ is differentiable at $c$, and

$$(\alpha f)'(c) = \alpha f'(c). \tag{3}$$

(b) The function $f + g$ is differentiable at $c$, and

$$(f + g)'(c) = f'(c) + g'(c). \tag{4}$$

(c) (Product Rule) The function $fg$ is differentiable at $c$, and

$$(fg)'(c) = f'(c) g(c) + f(c) g'(c). \tag{5}$$

(d) (Quotient Rule) If $g(c) \ne 0$, then the function $f/g$ is differentiable at $c$, and

$$\left( \frac{f}{g} \right)'(c) = \frac{f'(c) g(c) - f(c) g'(c)}{(g(c))^2}. \tag{6}$$


Proof. We shall prove (c) and (d), leaving (a) and (b) as exercises for the reader.
(c) Let $p := fg$; then for $x \in I$, $x \ne c$, we have

$$\begin{aligned}
\frac{p(x) - p(c)}{x - c} &= \frac{f(x) g(x) - f(c) g(c)}{x - c} \\
&= \frac{f(x) g(x) - f(c) g(x) + f(c) g(x) - f(c) g(c)}{x - c} \\
&= \frac{f(x) - f(c)}{x - c} \cdot g(x) + f(c) \cdot \frac{g(x) - g(c)}{x - c}.
\end{aligned}$$

Since $g$ is continuous at $c$, by Theorem 6.1.2, then $\lim_{x \to c} g(x) = g(c)$. Since $f$ and $g$ are
differentiable at $c$, we deduce from Theorem 4.2.4 on properties of limits that

$$\lim_{x \to c} \frac{p(x) - p(c)}{x - c} = f'(c) g(c) + f(c) g'(c).$$

Hence $p := fg$ is differentiable at $c$ and (5) holds.


(d) Let $q := f/g$. Since $g$ is differentiable at $c$, it is continuous at that point (by
Theorem 6.1.2). Therefore, since $g(c) \ne 0$, we know from Theorem 4.2.9 that there exists an
interval $J \subseteq I$ with $c \in J$ such that $g(x) \ne 0$ for all $x \in J$. For $x \in J$, $x \ne c$, we have

$$\begin{aligned}
\frac{q(x) - q(c)}{x - c} &= \frac{f(x)/g(x) - f(c)/g(c)}{x - c} = \frac{f(x) g(c) - f(c) g(x)}{g(x) g(c) (x - c)} \\
&= \frac{f(x) g(c) - f(c) g(c) + f(c) g(c) - f(c) g(x)}{g(x) g(c) (x - c)} \\
&= \frac{1}{g(x) g(c)} \left[ \frac{f(x) - f(c)}{x - c} \cdot g(c) - f(c) \cdot \frac{g(x) - g(c)}{x - c} \right].
\end{aligned}$$

Using the continuity of $g$ at $c$ and the differentiability of $f$ and $g$ at $c$, we get

$$q'(c) = \lim_{x \to c} \frac{q(x) - q(c)}{x - c} = \frac{f'(c) g(c) - f(c) g'(c)}{(g(c))^2}.$$

Thus, $q = f/g$ is differentiable at $c$ and equation (6) holds. Q.E.D.
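Rules (3) through (6) of Theorem 6.1.3 are exactly what is needed to propagate a pair (value at $c$, derivative at $c$) through arithmetic. The following sketch encodes each rule as an operation on such pairs; the class name `Jet` and the helpers `variable` and `constant` are hypothetical names introduced only for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Jet:
    """A pair (f(c), f'(c)) for some function f differentiable at c."""
    value: float
    deriv: float

    def scale(self, alpha):
        # Rule (3): (alpha f)'(c) = alpha f'(c).
        return Jet(alpha * self.value, alpha * self.deriv)

    def __add__(self, other):
        # Rule (4): (f + g)'(c) = f'(c) + g'(c).
        return Jet(self.value + other.value, self.deriv + other.deriv)

    def __mul__(self, other):
        # Rule (5): (fg)'(c) = f'(c) g(c) + f(c) g'(c).
        return Jet(self.value * other.value,
                   self.deriv * other.value + self.value * other.deriv)

    def __truediv__(self, other):
        # Rule (6): requires g(c) != 0.
        if other.value == 0:
            raise ZeroDivisionError("Quotient Rule requires g(c) != 0")
        return Jet(self.value / other.value,
                   (self.deriv * other.value - self.value * other.deriv)
                   / other.value ** 2)


def variable(c):
    # The identity function x has value c and derivative 1 at c.
    return Jet(c, 1.0)


def constant(a):
    # A constant function has derivative 0.
    return Jet(a, 0.0)


x = variable(2.0)
cube = x * x * x                         # x^3 at c = 2
quotient = x / (x * x + constant(1.0))   # x / (1 + x^2) at c = 2
```

Here `cube.deriv` corresponds to $3c^2$ and `quotient.deriv` to $(1 - c^2)/(1 + c^2)^2$ at $c = 2$, both obtained only from the rules of the theorem.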


Mathematical Induction may be used to obtain the following extensions of the 
differentiation rules. 


6.1.4 Corollary If $f_1, f_2, \ldots, f_n$ are functions on an interval $I$ to $\mathbb{R}$ that are differentiable
at $c \in I$, then:


(a) The function $f_1 + f_2 + \cdots + f_n$ is differentiable at $c$ and

$$(f_1 + f_2 + \cdots + f_n)'(c) = f_1'(c) + f_2'(c) + \cdots + f_n'(c). \tag{7}$$

(b) The function $f_1 f_2 \cdots f_n$ is differentiable at $c$, and

$$(f_1 f_2 \cdots f_n)'(c) = f_1'(c) f_2(c) \cdots f_n(c) + f_1(c) f_2'(c) \cdots f_n(c) + \cdots + f_1(c) f_2(c) \cdots f_n'(c). \tag{8}$$


An important special case of the extended product rule (8) occurs if the functions are
equal, that is, $f_1 = f_2 = \cdots = f_n = f$. Then (8) becomes

$$(f^n)'(c) = n (f(c))^{n-1} f'(c). \tag{9}$$

In particular, if we take $f(x) := x$, then we find the derivative of $g(x) := x^n$ to be
$g'(x) = n x^{n-1}$, $n \in \mathbb{N}$. The formula is extended to include negative integers by applying
the Quotient Rule 6.1.3(d).


Notation If $I \subseteq \mathbb{R}$ is an interval and $f : I \to \mathbb{R}$, we have introduced the notation $f'$ to
denote the function whose domain is a subset of $I$ and whose value at a point $c$ is the derivative
$f'(c)$ of $f$ at $c$. There are other notations that are sometimes used for $f'$; for example, one
sometimes writes $Df$ for $f'$. Thus one can write formulas (4) and (5) in the form:

$$D(f + g) = Df + Dg, \qquad D(fg) = (Df) \cdot g + f \cdot (Dg).$$

When $x$ is the “independent variable,” it is common practice in elementary courses to write
$df/dx$ for $f'$. Thus formula (5) is sometimes written in the form

$$\frac{d}{dx} \bigl( f(x) g(x) \bigr) = \left( \frac{df}{dx}(x) \right) g(x) + f(x) \left( \frac{dg}{dx}(x) \right).$$

This last notation, due to Leibniz, has certain advantages. However, it also has certain
disadvantages and must be used with some care.


The Chain Rule 


We now turn to the theorem on the differentiation of composite functions known as the 
“Chain Rule.” It provides a formula for finding the derivative of a composite function 
g o f in terms of the derivatives of g and f. 

We first establish the following theorem concerning the derivative of a function at a 
point that gives us a very nice method for proving the Chain Rule. It will also be used to 
derive the formula for differentiating inverse functions. 


6.1.5 Carathéodory's Theorem Let $f$ be defined on an interval $I$ containing the point $c$.
Then $f$ is differentiable at $c$ if and only if there exists a function $\varphi$ on $I$ that is continuous at
$c$ and satisfies

$$f(x) - f(c) = \varphi(x)(x - c) \quad \text{for } x \in I. \tag{10}$$

In this case, we have $\varphi(c) = f'(c)$.


Proof. ($\Rightarrow$) If $f'(c)$ exists, we can define $\varphi$ by

$$\varphi(x) := \begin{cases} \dfrac{f(x) - f(c)}{x - c} & \text{for } x \ne c,\ x \in I, \\[1ex] f'(c) & \text{for } x = c. \end{cases}$$

The continuity of $\varphi$ follows from the fact that $\lim_{x \to c} \varphi(x) = f'(c)$. If $x = c$, then both sides of
(10) equal $0$, while if $x \ne c$, then multiplication of $\varphi(x)$ by $x - c$ gives (10) for all other $x \in I$.

($\Leftarrow$) Now assume that a function $\varphi$ that is continuous at $c$ and satisfying (10) exists. If
we divide (10) by $x - c \ne 0$, then the continuity of $\varphi$ implies that

$$\varphi(c) = \lim_{x \to c} \varphi(x) = \lim_{x \to c} \frac{f(x) - f(c)}{x - c}$$

exists. Therefore $f$ is differentiable at $c$ and $f'(c) = \varphi(c)$. Q.E.D.


To illustrate Carathéodory's Theorem, we consider the function $f$ defined by $f(x) := x^3$
for $x \in \mathbb{R}$. For $c \in \mathbb{R}$, we see from the factorization

$$x^3 - c^3 = (x^2 + cx + c^2)(x - c)$$

that $\varphi(x) := x^2 + cx + c^2$ satisfies the conditions of the theorem. Therefore, we conclude
that $f$ is differentiable at $c \in \mathbb{R}$ and that $f'(c) = \varphi(c) = 3c^2$.
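The factorization used for $f(x) := x^3$ can be checked at a few sample points. This minimal sketch verifies the identity $f(x) - f(c) = \varphi(x)(x - c)$ of (10), including at $x = c$, and compares $\varphi(c)$ with $3c^2$; the choice $c = 1.5$ and the sample points are arbitrary assumptions made for illustration.

```python
import math


def f(x):
    return x ** 3


def phi(x, c):
    # Caratheodory factor for f(x) = x^3, read off from the factorization
    # x^3 - c^3 = (x^2 + c x + c^2)(x - c).
    return x * x + c * x + c * c


c = 1.5
sample_points = [-2.0, 0.0, 1.0, 1.5, 3.0]
# Check identity (10) at each sample point, including x = c.
identity_holds = all(
    math.isclose(f(x) - f(c), phi(x, c) * (x - c), abs_tol=1e-12)
    for x in sample_points
)
derivative_at_c = phi(c, c)  # should agree with 3 c^2
```

Because $\varphi$ is a polynomial here, its continuity at $c$ is immediate, which is the only analytic input the theorem requires.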
We will now establish the Chain Rule. If $f$ is differentiable at $c$ and $g$ is differentiable at
$f(c)$, then the Chain Rule states that the derivative of the composite function $g \circ f$ at $c$ is the
product $(g \circ f)'(c) = g'(f(c)) \cdot f'(c)$. Note that this can be written as

$$(g \circ f)' = (g' \circ f) \cdot f'.$$

One approach to the Chain Rule is the observation that the difference quotient can be
written, when $f(x) \ne f(c)$, as the product

$$\frac{g(f(x)) - g(f(c))}{x - c} = \frac{g(f(x)) - g(f(c))}{f(x) - f(c)} \cdot \frac{f(x) - f(c)}{x - c}.$$

This suggests the correct limiting value. Unfortunately, the first factor in the product on the
right is undefined if the denominator $f(x) - f(c)$ equals $0$ for values of $x$ near $c$, and this
presents a problem. However, the use of Carathéodory's Theorem neatly avoids this
difficulty.


6.1.6 Chain Rule Let $I$, $J$ be intervals in $\mathbb{R}$, let $g : I \to \mathbb{R}$ and $f : J \to \mathbb{R}$ be functions
such that $f(J) \subseteq I$, and let $c \in J$. If $f$ is differentiable at $c$ and if $g$ is differentiable at $f(c)$,
then the composite function $g \circ f$ is differentiable at $c$ and

$$(g \circ f)'(c) = g'(f(c)) \cdot f'(c). \tag{11}$$


Proof. Since $f'(c)$ exists, Carathéodory's Theorem 6.1.5 implies that there exists a
function $\varphi$ on $J$ such that $\varphi$ is continuous at $c$ and $f(x) - f(c) = \varphi(x)(x - c)$ for
$x \in J$, and where $\varphi(c) = f'(c)$. Also, since $g'(f(c))$ exists, there is a function $\psi$ defined
on $I$ such that $\psi$ is continuous at $d := f(c)$ and $g(y) - g(d) = \psi(y)(y - d)$ for $y \in I$, where
$\psi(d) = g'(d)$. Substitution of $y = f(x)$ and $d = f(c)$ then produces

$$g(f(x)) - g(f(c)) = \psi(f(x)) \bigl( f(x) - f(c) \bigr) = \bigl[ (\psi \circ f)(x) \cdot \varphi(x) \bigr] (x - c)$$

for all $x \in J$ such that $f(x) \in I$. Since the function $(\psi \circ f) \cdot \varphi$ is continuous at $c$ and its
value at $c$ is $g'(f(c)) \cdot f'(c)$, Carathéodory's Theorem gives (11). Q.E.D.
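The proof can be traced on a concrete pair of functions for which the naive difference-quotient argument breaks down. In the sketch below, $f(x) := x^2$ and $g(y) := y^3$ with $c = 1$ are sample choices (assumptions for illustration); at $x = -1$ we have $f(x) = f(c)$, so the quotient with denominator $f(x) - f(c)$ is undefined, yet the product $(\psi \circ f) \cdot \varphi$ is still defined and satisfies the factorization. The helper name `caratheodory_factor` is hypothetical.

```python
import math


def caratheodory_factor(func, point, derivative_at_point):
    """Build the function of Theorem 6.1.5 for func at the given point."""
    def factor(x):
        if x == point:
            return derivative_at_point
        return (func(x) - func(point)) / (x - point)
    return factor


def f(x):
    return x * x        # f'(1) = 2


def g(y):
    return y ** 3       # g'(1) = 3


c = 1.0
d = f(c)
phi = caratheodory_factor(f, c, 2.0)
psi = caratheodory_factor(g, d, 3.0)


def chain_factor(x):
    # The function (psi o f) * phi from the proof of the Chain Rule.
    return psi(f(x)) * phi(x)


# x = -1.0 is the case f(x) = f(c), where the naive quotient is undefined.
sample_points = [-1.0, 0.5, 1.0, 2.0]
factorization_holds = all(
    math.isclose(g(f(x)) - g(f(c)), chain_factor(x) * (x - c), abs_tol=1e-12)
    for x in sample_points
)
chain_rule_value = chain_factor(c)  # g'(f(c)) * f'(c) = 3 * 2
```

Since $(g \circ f)(x) = x^6$, the value `chain_rule_value` can be compared directly with the derivative $6x^5$ at $x = 1$.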

If $g$ is differentiable on $I$, if $f$ is differentiable on $J$, and if $f(J) \subseteq I$, then it follows from
the Chain Rule that $(g \circ f)' = (g' \circ f) \cdot f'$, which can also be written in the form

$$D(g \circ f) = (Dg \circ f) \cdot Df.$$

6.1.7 Examples (a) If $f : I \to \mathbb{R}$ is differentiable on $I$ and $g(y) := y^n$ for $y \in \mathbb{R}$ and
$n \in \mathbb{N}$, then since $g'(y) = n y^{n-1}$, it follows from the Chain Rule 6.1.6 that

$$(g \circ f)'(x) = g'(f(x)) \cdot f'(x) \quad \text{for } x \in I.$$

Therefore we have $(f^n)'(x) = n (f(x))^{n-1} f'(x)$ for all $x \in I$ as was seen in (9).


(b) Suppose that $f : I \to \mathbb{R}$ is differentiable on $I$ and that $f(x) \ne 0$ and $f'(x) \ne 0$ for $x \in I$.
If $h(y) := 1/y$ for $y \ne 0$, then it is an exercise to show that $h'(y) = -1/y^2$ for $y \in \mathbb{R}$, $y \ne 0$.
Therefore we have

$$\left( \frac{1}{f} \right)'(x) = (h \circ f)'(x) = h'(f(x)) f'(x) = -\frac{f'(x)}{(f(x))^2} \quad \text{for } x \in I.$$
(c) The absolute value function $g(x) := |x|$ is differentiable at all $x \ne 0$ and has derivative
$g'(x) = \operatorname{sgn}(x)$ for $x \ne 0$. (The signum function is defined in Example 4.1.10(b).) Though
sgn is defined everywhere, it is not equal to $g'$ at $x = 0$ since $g'(0)$ does not exist.

Now if $f$ is a differentiable function, then the Chain Rule implies that the function
$g \circ f = |f|$ is also differentiable at all points $x$ where $f(x) \ne 0$, and its derivative is
given by

$$|f|'(x) = \operatorname{sgn}(f(x)) \cdot f'(x) = \begin{cases} f'(x) & \text{if } f(x) > 0, \\ -f'(x) & \text{if } f(x) < 0. \end{cases}$$

If $f$ is differentiable at a point $c$ with $f(c) = 0$, then it is an exercise to show that $|f|$ is
differentiable at $c$ if and only if $f'(c) = 0$. (See Exercise 7.)

For example, if $f(x) := x^2 - 1$ for $x \in \mathbb{R}$, then the derivative of its absolute value
$|f|(x) = |x^2 - 1|$ is equal to $|f|'(x) = \operatorname{sgn}(x^2 - 1) \cdot (2x)$ for $x \ne 1, -1$. See Figure 6.1.1
for a graph of $|f|$.

Figure 6.1.1 The function $|f|(x) = |x^2 - 1|$.


(d) It will be proved later that if $S(x) := \sin x$ and $C(x) := \cos x$ for all $x \in \mathbb{R}$, then

$$S'(x) = \cos x = C(x) \quad \text{and} \quad C'(x) = -\sin x = -S(x)$$

for all $x \in \mathbb{R}$. If we use these facts together with the definitions

$$\tan x := \frac{\sin x}{\cos x}, \qquad \sec x := \frac{1}{\cos x}$$

for $x \ne (2k + 1)\pi/2$, $k \in \mathbb{Z}$, and apply the Quotient Rule 6.1.3(d), we obtain

$$D \tan x = \frac{(\cos x)(\cos x) - (\sin x)(-\sin x)}{(\cos x)^2} = (\sec x)^2,$$

$$D \sec x = \frac{0 - 1(-\sin x)}{(\cos x)^2} = \frac{\sin x}{(\cos x)^2} = (\sec x)(\tan x)$$

for $x \ne (2k + 1)\pi/2$, $k \in \mathbb{Z}$.
Similarly, since

$$\cot x := \frac{\cos x}{\sin x}, \qquad \csc x := \frac{1}{\sin x}$$

for $x \ne k\pi$, $k \in \mathbb{Z}$, then we obtain

$$D \cot x = -(\csc x)^2 \quad \text{and} \quad D \csc x = -(\csc x)(\cot x)$$

for $x \ne k\pi$, $k \in \mathbb{Z}$.
(e) Suppose that $f$ is defined by

$$f(x) := \begin{cases} x^2 \sin(1/x) & \text{for } x \ne 0, \\ 0 & \text{for } x = 0. \end{cases}$$

If we use the fact that $D \sin x = \cos x$ for all $x \in \mathbb{R}$ and apply the Product Rule 6.1.3(c) and
the Chain Rule 6.1.6, we obtain (why?)

$$f'(x) = 2x \sin(1/x) - \cos(1/x) \quad \text{for } x \ne 0.$$

If $x = 0$, none of the calculational rules may be applied. (Why?) Consequently, the
derivative of $f$ at $x = 0$ must be found by applying the definition of derivative. We find that

$$f'(0) = \lim_{x \to 0} \frac{f(x) - f(0)}{x - 0} = \lim_{x \to 0} \frac{x^2 \sin(1/x)}{x} = \lim_{x \to 0} x \sin(1/x) = 0.$$

Hence, the derivative $f'$ of $f$ exists at all $x \in \mathbb{R}$. However, the function $f'$ does not have a
limit at $x = 0$ (why?), and consequently $f'$ is discontinuous at $x = 0$. Thus, a function $f$ that
is differentiable at every point of $\mathbb{R}$ need not have a continuous derivative $f'$.
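Both halves of this example can be observed numerically. In the sketch below, the difference quotients of $f$ at $0$ shrink with $|h|$ (they equal $h \sin(1/h)$), while $f'$ evaluated along the sequences $1/(2k\pi)$ and $1/((2k+1)\pi)$ stays close to $-1$ and $+1$ respectively, so $f'$ cannot have a limit at $0$. The particular sequences and variable names are illustrative assumptions.

```python
import math


def f(x):
    return x * x * math.sin(1.0 / x) if x != 0 else 0.0


def f_prime(x):
    # Formula from the Product and Chain Rules for x != 0; the value at 0
    # is the one found from the definition of the derivative.
    if x == 0:
        return 0.0
    return 2.0 * x * math.sin(1.0 / x) - math.cos(1.0 / x)


offsets = [10.0 ** (-k) for k in range(1, 8)]
# Difference quotients (f(h) - f(0)) / h, each bounded by |h|.
zero_quotients = [(f(h) - f(0.0)) / h for h in offsets]

# Along x = 1/(2 k pi) the cosine term is 1, along x = 1/((2k+1) pi) it is -1.
near_minus_one = [f_prime(1.0 / (2 * k * math.pi)) for k in range(1, 6)]
near_plus_one = [f_prime(1.0 / ((2 * k + 1) * math.pi)) for k in range(1, 6)]
```

The two sequences of points both converge to $0$, but the corresponding values of $f'$ approach different numbers, which is the content of the parenthetical “why?” above.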


Inverse Functions 


We will now relate the derivative of a function to the derivative of its inverse function, when 
this inverse function exists. We will limit our attention to a continuous strictly monotone 
function and use the Continuous Inverse Theorem 5.6.5 to ensure the existence of a 
continuous inverse function. 

If $f$ is a continuous strictly monotone function on an interval $I$, then its inverse function
$g = f^{-1}$ is defined on the interval $J := f(I)$ and satisfies the relation

$$g(f(x)) = x \quad \text{for } x \in I.$$

If $c \in I$ and $d := f(c)$, and if we knew that both $f'(c)$ and $g'(d)$ exist, then we could
differentiate both sides of the equation and apply the Chain Rule to the left side to get
$g'(f(c)) \cdot f'(c) = 1$. Thus, if $f'(c) \ne 0$, we would obtain

$$g'(d) = \frac{1}{f'(c)}.$$

However, it is necessary to deduce the differentiability of the inverse function $g$ from the
assumed differentiability of $f$ before such a calculation can be performed. This is nicely
accomplished by using Carathéodory's Theorem.


6.1.8 Theorem Let $I$ be an interval in $\mathbb{R}$ and let $f : I \to \mathbb{R}$ be strictly monotone and
continuous on $I$. Let $J := f(I)$ and let $g : J \to \mathbb{R}$ be the strictly monotone and continuous
function inverse to $f$. If $f$ is differentiable at $c \in I$ and $f'(c) \ne 0$, then $g$ is differentiable at
$d := f(c)$ and

$$g'(d) = \frac{1}{f'(c)} = \frac{1}{f'(g(d))}. \tag{12}$$


Proof. Given $c \in I$, we obtain from Carathéodory's Theorem 6.1.5 a function $\varphi$ on $I$ with
properties that $\varphi$ is continuous at $c$, $f(x) - f(c) = \varphi(x)(x - c)$ for $x \in I$, and $\varphi(c) = f'(c)$.
Since $\varphi(c) \ne 0$ by hypothesis, there exists a neighborhood $V := (c - \delta, c + \delta)$ such that
$\varphi(x) \ne 0$ for all $x \in V \cap I$. (See Theorem 4.2.9.) If $U := f(V \cap I)$, then the inverse
function $g$ satisfies $f(g(y)) = y$ for all $y \in U$, so that

$$y - d = f(g(y)) - f(c) = \varphi(g(y)) \cdot \bigl( g(y) - g(d) \bigr).$$

Since $\varphi(g(y)) \ne 0$ for $y \in U$, we can divide to get

$$g(y) - g(d) = \frac{1}{\varphi(g(y))} \cdot (y - d).$$

Since the function $1/(\varphi \circ g)$ is continuous at $d$, we apply Theorem 6.1.5 to conclude that
$g'(d)$ exists and $g'(d) = 1/\varphi(g(d)) = 1/\varphi(c) = 1/f'(c)$. Q.E.D.


Note The hypothesis, made in Theorem 6.1.8, that $f'(c) \ne 0$ is essential. In fact, if
$f'(c) = 0$, then the inverse function $g$ is never differentiable at $d = f(c)$, since the assumed
existence of $g'(d)$ would lead to $1 = f'(c) g'(d) = 0$, which is impossible. The function
$f(x) := x^3$ with $c = 0$ is such an example.


6.1.9 Theorem Let $I$ be an interval and let $f : I \to \mathbb{R}$ be strictly monotone on $I$. Let
$J := f(I)$ and let $g : J \to \mathbb{R}$ be the function inverse to $f$. If $f$ is differentiable on $I$ and
$f'(x) \ne 0$ for $x \in I$, then $g$ is differentiable on $J$ and

$$g' = \frac{1}{f' \circ g}. \tag{13}$$

Proof. If $f$ is differentiable on $I$, then Theorem 6.1.2 implies that $f$ is continuous on $I$, and
by the Continuous Inverse Theorem 5.6.5, the inverse function $g$ is continuous on $J$.
Equation (13) now follows from Theorem 6.1.8. Q.E.D.


Remark If $f$ and $g$ are the functions of Theorem 6.1.9, and if $x \in I$ and $y \in J$ are related
by $y = f(x)$ and $x = g(y)$, then equation (13) can be written in the form

$$g'(y) = \frac{1}{(f' \circ g)(y)}, \quad y \in J, \qquad \text{or} \qquad (g' \circ f)(x) = \frac{1}{f'(x)}, \quad x \in I.$$

It can also be written in the form $g'(y) = 1/f'(x)$, provided that it is kept in mind that $x$ and
$y$ are related by $y = f(x)$ and $x = g(y)$.


6.1.10 Examples (a) The function $f : \mathbb{R} \to \mathbb{R}$ defined by $f(x) := x^5 + 4x + 3$ is continuous
and strictly monotone increasing (since it is the sum of two strictly increasing
functions). Moreover, $f'(x) = 5x^4 + 4$ is never zero. Therefore, by Theorem 6.1.8, the
inverse function $g = f^{-1}$ is differentiable at every point. If we take $c = 1$, then since
$f(1) = 8$, we obtain $g'(8) = g'(f(1)) = 1/f'(1) = 1/9$.
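The value $g'(8) = 1/9$ can be checked without a closed formula for $g$. The sketch below approximates $g = f^{-1}$ by bisection (legitimate because $f$ is continuous and strictly increasing), evaluates $1/f'(g(y))$ as in Theorem 6.1.8, and compares it with a symmetric difference quotient of $g$; the bracket $[-10, 10]$, the step count, and the increment `h` are assumptions chosen for illustration.

```python
def f(x):
    return x ** 5 + 4.0 * x + 3.0


def f_prime(x):
    return 5.0 * x ** 4 + 4.0


def g(y, lo=-10.0, hi=10.0, steps=200):
    """Approximate g(y) = f^{-1}(y) by bisection on the bracket [lo, hi]."""
    if not f(lo) <= y <= f(hi):
        raise ValueError("y is outside f([lo, hi])")
    for _ in range(steps):
        mid = (lo + hi) / 2.0
        # f is strictly increasing, so f(mid) < y means the solution is right of mid.
        if f(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def g_prime(y):
    # Formula (13): g'(y) = 1 / f'(g(y)).
    return 1.0 / f_prime(g(y))


h = 1e-4
# Symmetric difference quotient of g at y = 8, for comparison only.
numerical_slope = (g(8.0 + h) - g(8.0 - h)) / (2.0 * h)
```

The agreement between `g_prime(8.0)` and `numerical_slope` is a sanity check on formula (13), not a substitute for the proof.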

(b) Let $n \in \mathbb{N}$ be even, let $I := [0, \infty)$, and let $f(x) := x^n$ for $x \in I$. It was seen at the end
of Section 5.6 that $f$ is strictly increasing and continuous on $I$, so that its inverse function
$g(y) := y^{1/n}$ for $y \in J := [0, \infty)$ is also strictly increasing and continuous on $J$. Moreover,
we have $f'(x) = n x^{n-1}$ for all $x \in I$. Hence it follows that if $y > 0$, then $g'(y)$ exists and

$$g'(y) = \frac{1}{f'(g(y))} = \frac{1}{n (g(y))^{n-1}} = \frac{1}{n y^{(n-1)/n}}.$$

Hence we deduce that

$$g'(y) = \frac{1}{n} y^{(1/n) - 1} \quad \text{for } y > 0.$$

However, $g$ is not differentiable at $0$. (For a graph of $f$ and $g$, see Figures 5.6.4 and 5.6.5.)

(c) Let $n \in \mathbb{N}$, $n \ne 1$, be odd, let $F(x) := x^n$ for $x \in \mathbb{R}$, and let $G(y) := y^{1/n}$ be its inverse
function defined for all $y \in \mathbb{R}$. As in part (b) we find that $G$ is differentiable for $y \ne 0$ and
that $G'(y) = (1/n) y^{(1/n) - 1}$ for $y \ne 0$. However, $G$ is not differentiable at $0$, even though $G$ is
differentiable for all $y \ne 0$. (For a graph of $F$ and $G$, see Figures 5.6.6 and 5.6.7.)

(d) Let $r := m/n$ be a positive rational number, let $I := [0, \infty)$, and let $R(x) := x^r$ for
$x \in I$. (Recall Definition 5.6.6.) Then $R$ is the composition of the functions $f(x) := x^m$ and
$g(x) := x^{1/n}$, $x \in I$. That is, $R(x) = f(g(x))$ for $x \in I$. If we apply the Chain Rule 6.1.6
and the results of (b) [or (c), depending on whether $n$ is even or odd], then we obtain

$$R'(x) = f'(g(x)) g'(x) = m (x^{1/n})^{m-1} \cdot \frac{1}{n} x^{(1/n) - 1} = \frac{m}{n} x^{(m/n) - 1} = r x^{r-1}$$

for all $x > 0$. If $r > 1$, then it is an exercise to show that the derivative also exists at $x = 0$
and $R'(0) = 0$. (For a graph of $R$ see Figure 5.6.8.)

(e) The sine function is strictly increasing on the interval $I := [-\pi/2, \pi/2]$; therefore its
inverse function, which we will denote by Arcsin, exists on $J := [-1, 1]$. That is, if
$x \in [-\pi/2, \pi/2]$ and $y \in [-1, 1]$ then $y = \sin x$ if and only if $\operatorname{Arcsin} y = x$. It was asserted
(without proof) in Example 6.1.7(d) that sin is differentiable on $I$ and that $D \sin x = \cos x$
for $x \in I$. Since $\cos x \ne 0$ for $x$ in $(-\pi/2, \pi/2)$ it follows from Theorem 6.1.8 that

$$D \operatorname{Arcsin} y = \frac{1}{D \sin x} = \frac{1}{\cos x} = \frac{1}{\sqrt{1 - (\sin x)^2}} = \frac{1}{\sqrt{1 - y^2}}$$

for all $y \in (-1, 1)$. The derivative of Arcsin does not exist at the points $-1$ and $1$.


Exercises for Section 6.1 


1. Use the definition to find the derivative of each of the following functions: 
(a) f(x) := x forx ER, (b) g(x) := 1/xforx € R, x £0, 
(c) A(x) := yx for x > 0, (d) k(x) := 1/yx for x > 0. 


10. 


11. 


12. 


13. 


14. 


15. 


16. 


17. 
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Show that f(x) := x'/3, x € R, is not differentiable at x = 0. 
Prove Theorem 6.1.3(a), (b). 


Let f : R — R be defined by f(x) := x? for x rational, f(x) := 0 for x irrational. Show that f is 
differentiable at x = 0, and find f’(0). 


Differentiate and simplify: 
x 
(a) f(x) "= eye (b) g(x) = VS — 2x + x?, 
(c) h(x) := (sinx*)” form, k € N, (d) k(x) := tan (x°) for |x| < 4/7/2. 


Let n € N and let f : R — R be defined by f(x) := x” for x > 0 and f(x) := 0 for x < 0. For 
which values of n is f’ continuous at 0? For which values of n is f” differentiable at 0? 


Suppose that f : R — R is differentiable at c and that f(c) = 0. Show that g(x) := |f(x)| is 
differentiable at c if and only if f’(c) = 0. 


Determine where each of the following functions from R to R is differentiable and find the 
derivative: 

@ f(x) = |x] +|x+ I], (b) g(x) = 2x + |x], 

(c) h(x) := x|x], (d) k(x) := |sin x|. 

Prove that if f : R — R is an even function [that is, f(—x) = f(x) for all x € R] and has a 
derivative at every point, then the derivative f” is an odd function [that is, f’(—x) = —f’(x) for 


all x € R]. Also prove that if g : R — R is a differentiable odd function, then g’ is an even 
function. 


Let g : R — R be defined by g(x) := x’sin(1/x”) for x Æ 0, and g(0) := 0. Show that g is 
differentiable for all x € R. Also show that the derivative g’ is not bounded on the interval 
[-1, 1]. 


Assume that there exists a function L : (0,00) — R such that L’(x) = 1/x for x > 0. Calculate 
the derivatives of the following functions: 
(a) f(x) := L(2x + 3) for x > 0, (b) g(x) := (LB? for x > 0, 


(c) A(x) := L(ax) fora > 0, x > 0, (d) k(x) := L(L(x)) when L(x) > 0, x > 0. 


If r > 0 is a rational number, let f : R — R be defined by f(x) := x"sin (1/x) for x Æ 0, and 
f(0) := 0. Determine those values of r for which f’(0) exists. 


If f : R — R is differentiable at c € R, show that 


F'(e) = lim(n{ f(c + 1/n) — f(o)})- 


However, show by example that the existence of the limit of this sequence does not imply the 
existence of f’(c). 


Given that the function A(x) := x3 + 2x + 1 for x € R has an inverse h~! on R, find the value of 
1 . . 
(h-') (y) at the points corresponding to x = 0, 1, —1. 


Given that the restriction of the cosine function cos to I := [0, π] is strictly decreasing and that
cos 0 = 1, cos π = −1, let J := [−1, 1], and let Arccos: J → R be the function inverse to the
restriction of cos to I. Show that Arccos is differentiable on (−1, 1) and D Arccos y =
(−1)/(1 − y^2)^{1/2} for y ∈ (−1, 1). Show that Arccos is not differentiable at −1 and 1.


Given that the restriction of the tangent function tan to I := (−π/2, π/2) is strictly increasing
and that tan(I) = R, let Arctan: R → R be the function inverse to the restriction of tan to I.
Show that Arctan is differentiable on R and that D Arctan(y) = (1 + y^2)^{−1} for y ∈ R.


Let f : I → R be differentiable at c ∈ I. Establish the Straddle Lemma: Given ε > 0 there
exists δ(ε) > 0 such that if u, v ∈ I satisfy c − δ(ε) < u ≤ c ≤ v < c + δ(ε), then we have
|f(v) − f(u) − (v − u) f'(c)| ≤ ε(v − u). [Hint: The δ(ε) is given by Definition 6.1.1. Subtract
and add the term f(c) − c f'(c) on the left side and use the Triangle Inequality.]




Section 6.2 The Mean Value Theorem 


The Mean Value Theorem, which relates the values of a function to values of its derivative, 
is one of the most useful results in real analysis. In this section we will establish this 
important theorem and sample some of its many consequences. 

We begin by looking at the relationship between the relative extrema of a function and
the values of its derivative. Recall that the function f : I → R is said to have a relative
maximum [respectively, relative minimum] at c ∈ I if there exists a neighborhood
V := V_δ(c) of c such that f(x) ≤ f(c) [respectively, f(c) ≤ f(x)] for all x in V ∩ I. We
say that f has a relative extremum at c ∈ I if it has either a relative maximum or a relative
minimum at c.

The next result provides the theoretical justification for the familiar process of finding
points at which f has relative extrema by examining the zeros of the derivative. However, it
must be realized that this procedure applies only to interior points of the interval. For
example, if f(x) := x on the interval I := [0, 1], then the endpoint x = 0 yields the unique
relative minimum and the endpoint x = 1 yields the unique maximum of f on I, but neither
point is a zero of the derivative of f.


6.2.1 Interior Extremum Theorem Let c be an interior point of the interval I at which
f : I → R has a relative extremum. If the derivative of f at c exists, then f'(c) = 0.


Proof. We will prove the result only for the case that f has a relative maximum at c; the
proof for the case of a relative minimum is similar.
If f'(c) > 0, then by Theorem 4.2.9 there exists a neighborhood V ⊆ I of c such that

(f(x) − f(c))/(x − c) > 0 for x ∈ V, x ≠ c.

If x ∈ V and x > c, then we have

f(x) − f(c) = (x − c) · (f(x) − f(c))/(x − c) > 0.

But this contradicts the hypothesis that f has a relative maximum at c. Thus we cannot
have f'(c) > 0. Similarly (how?), we cannot have f'(c) < 0. Therefore we must have
f'(c) = 0. Q.E.D.


6.2.2 Corollary Let f : I → R be continuous on an interval I and suppose that f has a
relative extremum at an interior point c of I. Then either the derivative of f at c does not
exist, or it is equal to zero.


We note that if f(x) := |x| on I := [−1, 1], then f has an interior minimum at x = 0;
however, the derivative of f fails to exist at x = 0.


6.2.3 Rolle’s Theorem Suppose that f is continuous on a closed interval I := [a, b], that
the derivative f' exists at every point of the open interval (a, b), and that f(a) = f(b) = 0.
Then there exists at least one point c in (a, b) such that f'(c) = 0.


Proof. If f vanishes identically on I, then any c in (a, b) will satisfy the conclusion of the
theorem. Hence we suppose that f does not vanish identically; replacing f by −f if
necessary, we may suppose that f assumes some positive values. By the Maximum-Minimum
Theorem 5.3.4, the function f attains the value sup{f(x) : x ∈ I} > 0 at some
point c in I. Since f(a) = f(b) = 0, the point c must lie in (a, b); therefore f'(c) exists.
Since f has a relative maximum at c, we conclude from the Interior Extremum
Theorem 6.2.1 that f'(c) = 0. (See Figure 6.2.1.) Q.E.D.


Figure 6.2.1 Rolle’s Theorem


As a consequence of Rolle’s Theorem, we obtain the fundamental Mean Value 
Theorem. 


6.2.4 Mean Value Theorem Suppose that f is continuous on a closed interval
I := [a, b], and that f has a derivative in the open interval (a, b). Then there exists at
least one point c in (a, b) such that

f(b) − f(a) = f'(c)(b − a).


Proof. Consider the function φ defined on I by

φ(x) := f(x) − f(a) − ((f(b) − f(a))/(b − a))(x − a).

[The function φ is simply the difference of f and the function whose graph is the line
segment joining the points (a, f(a)) and (b, f(b)); see Figure 6.2.2.] The hypotheses of
Rolle’s Theorem are satisfied by φ since φ is continuous on [a, b], differentiable on (a, b),
and φ(a) = φ(b) = 0. Therefore, there exists a point c in (a, b) such that

0 = φ'(c) = f'(c) − (f(b) − f(a))/(b − a).

Hence, f(b) − f(a) = f'(c)(b − a). Q.E.D.


Figure 6.2.2 The Mean Value Theorem
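As a numerical illustration of the theorem (not part of the proof), the following sketch locates a point c with f'(c) = (f(b) − f(a))/(b − a) by bisection. The sample function f(x) = x^3 on [1, 3] and the helper name mean_value_point are assumptions chosen for the example; bisection also assumes that f' is continuous and that f' minus the secant slope changes sign on [a, b], which the theorem itself does not require.

```python
def mean_value_point(f, df, a, b, tol=1e-12):
    """Find c in (a, b) with df(c) equal to the secant slope of f on [a, b].

    Hypothetical helper: uses bisection, so it assumes df is continuous and
    df - slope has opposite signs at a and b.
    """
    slope = (f(b) - f(a)) / (b - a)
    lo, hi = a, b
    if (df(lo) - slope) * (df(hi) - slope) > 0:
        raise ValueError("no sign change; bisection does not apply")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        # Keep the half-interval on which df - slope still changes sign.
        if (df(lo) - slope) * (df(mid) - slope) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2


f = lambda x: x**3        # sample function (an assumption for illustration)
df = lambda x: 3 * x**2   # its derivative
c = mean_value_point(f, df, 1.0, 3.0)
print(c, df(c), (f(3.0) - f(1.0)) / 2.0)
```

Here the secant slope is 13, so the point found should agree with the exact value c = √(13/3) obtained by solving 3c^2 = 13.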


Remark The geometric view of the Mean Value Theorem is that there is some point on 
the curve y = f(x) at which the tangent line is parallel to the line segment through the 
points (a, f(a)) and (b, f(b)). Thus it is easy to remember the statement of the Mean Value 
Theorem by drawing appropriate diagrams. While this should not be discouraged, it tends 
to suggest that its importance is geometrical in nature, which is quite misleading. In fact the 
Mean Value Theorem is a wolf in sheep’s clothing and is the Fundamental Theorem of 
Differential Calculus. In the remainder of this section, we will present some of the 
consequences of this result. Other applications will be given later. 


The Mean Value Theorem permits one to draw conclusions about the nature of a 
function f from information about its derivative f’. The following results are obtained in 
this manner. 


6.2.5 Theorem Suppose that f is continuous on the closed interval I := [a, b], that f is
differentiable on the open interval (a, b), and that f'(x) = 0 for x ∈ (a, b). Then f is
constant on I.


Proof. We will show that f(x) = f(a) for all x ∈ I. Indeed, if x ∈ I, x > a, is given,
we apply the Mean Value Theorem to f on the closed interval [a, x]. We obtain a
point c (depending on x) between a and x such that f(x) − f(a) = f'(c)(x − a). Since
f'(c) = 0 (by hypothesis), we deduce that f(x) − f(a) = 0. Hence, f(x) = f(a) for
any x ∈ I. Q.E.D.


6.2.6 Corollary Suppose that f and g are continuous on I := [a, b], that they are
differentiable on (a, b), and that f'(x) = g'(x) for all x ∈ (a, b). Then there exists a
constant C such that f = g + C on I.


Recall that a function f : I → R is said to be increasing on the interval I if whenever
x_1, x_2 in I satisfy x_1 < x_2, then f(x_1) ≤ f(x_2). Also recall that f is decreasing on I if the
function −f is increasing on I.


6.2.7 Theorem Let f : I → R be differentiable on the interval I. Then:
(a) f is increasing on I if and only if f'(x) ≥ 0 for all x ∈ I.
(b) f is decreasing on I if and only if f'(x) ≤ 0 for all x ∈ I.


Proof. (a) Suppose that f'(x) ≥ 0 for all x ∈ I. If x_1, x_2 in I satisfy x_1 < x_2, then we
apply the Mean Value Theorem to f on the closed interval J := [x_1, x_2] to obtain a point c in
(x_1, x_2) such that

f(x_2) − f(x_1) = f'(c)(x_2 − x_1).

Since f'(c) ≥ 0 and x_2 − x_1 > 0, it follows that f(x_2) − f(x_1) ≥ 0. (Why?) Hence,
f(x_1) ≤ f(x_2) and, since x_1 < x_2 are arbitrary points in I, we conclude that f is
increasing on I.

For the converse assertion, we suppose that f is differentiable and increasing on I.
Thus, for any point x ≠ c in I, we have (f(x) − f(c))/(x − c) ≥ 0. (Why?) Hence, by
Theorem 4.2.6 we conclude that

f'(c) = lim_{x→c} (f(x) − f(c))/(x − c) ≥ 0.

(b) The proof of part (b) is similar and will be omitted. Q.E.D.


A function f is said to be strictly increasing on an interval I if for any points x_1, x_2 in I
such that x_1 < x_2, we have f(x_1) < f(x_2). An argument along the same lines as the proof
of Theorem 6.2.7 can be made to show that a function having a strictly positive derivative
on an interval is strictly increasing there. (See Exercise 13.) However, the converse
assertion is not true, since a strictly increasing differentiable function may have a derivative
that vanishes at certain points. For example, the function f : R → R defined by f(x) := x^3
is strictly increasing on R, but f'(0) = 0. The situation for strictly decreasing functions is
similar.


Remark It is reasonable to define a function to be increasing at a point if there is a
neighborhood of the point on which the function is increasing. One might suppose that, if
the derivative is strictly positive at a point, then the function is increasing at this point.
However, this supposition is false; indeed, the differentiable function defined by

g(x) := x + 2x^2 sin(1/x) if x ≠ 0,
g(x) := 0 if x = 0,

is such that g'(0) = 1, yet it can be shown that g is not increasing in any neighborhood of
x = 0. (See Exercise 10.)
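The claim in this remark can be checked numerically. For x ≠ 0 the usual differentiation rules give g'(x) = 1 + 4x sin(1/x) − 2 cos(1/x), so at the points x = 1/(2πk) the derivative is close to −1, while at x = 1/((2k + 1)π) it is close to 3. The sketch below evaluates g' at a few such points; the function name g_prime and the choice of sample points are assumptions made for illustration, and a finite sample is of course no substitute for the proof requested in Exercise 10.

```python
import math


def g_prime(x):
    """Derivative of g(x) = x + 2x^2 sin(1/x) for x != 0."""
    return 1 + 4 * x * math.sin(1 / x) - 2 * math.cos(1 / x)


# Points approaching 0 where cos(1/x) = 1, so g'(x) is approximately -1.
neg_points = [1 / (2 * math.pi * k) for k in range(1, 6)]
# Points approaching 0 where cos(1/x) = -1, so g'(x) is approximately 3.
pos_points = [1 / ((2 * k + 1) * math.pi) for k in range(1, 6)]

for x in neg_points:
    print(f"x = {x:.5f}, g'(x) = {g_prime(x):+.5f}")
for x in pos_points:
    print(f"x = {x:.5f}, g'(x) = {g_prime(x):+.5f}")
```

Both families of points accumulate at 0, which is why no neighborhood of 0 can be an interval on which g is increasing.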


We next obtain a sufficient condition for a function to have a relative extremum at an 
interior point of an interval. 


6.2.8 First Derivative Test for Extrema Let f be continuous on the interval I := [a, b]
and let c be an interior point of I. Assume that f is differentiable on (a, c) and (c, b). Then:


(a) If there is a neighborhood (c − δ, c + δ) ⊆ I such that f'(x) ≥ 0 for c − δ < x < c
and f'(x) ≤ 0 for c < x < c + δ, then f has a relative maximum at c.


(b) If there is a neighborhood (c − δ, c + δ) ⊆ I such that f'(x) ≤ 0 for c − δ < x < c
and f'(x) ≥ 0 for c < x < c + δ, then f has a relative minimum at c.


Proof. (a) If x ∈ (c − δ, c), then it follows from the Mean Value Theorem that there
exists a point c_x ∈ (x, c) such that f(c) − f(x) = (c − x) f'(c_x). Since f'(c_x) ≥ 0 we infer
that f(x) ≤ f(c) for x ∈ (c − δ, c). Similarly, it follows (how?) that f(x) ≤ f(c)
for x ∈ (c, c + δ). Therefore f(x) ≤ f(c) for all x ∈ (c − δ, c + δ) so that f has a relative
maximum at c.


(b) The proof is similar. Q.E.D.


Remark The converse of the First Derivative Test 6.2.8 is not true. For example, there
exists a differentiable function f : R → R with absolute minimum at x = 0 but such that
f' takes on both positive and negative values on both sides of (and arbitrarily close to)
x = 0. (See Exercise 9.)


Further Applications of the Mean Value Theorem 


We will continue giving other types of applications of the Mean Value Theorem; in doing 
so we will draw more freely than before on the past experience of the reader and his or her 
knowledge concerning the derivatives of certain well-known functions. 


6.2.9 Examples (a) Rolle’s Theorem can be used for the location of roots of a function.
For, if a function g can be identified as the derivative of a function f, then between any two
roots of f there is at least one root of g. For example, let g(x) := cos x, then g is known to be
the derivative of f(x) := sin x. Hence, between any two roots of sin x there is at least one
root of cos x. On the other hand, g'(x) = −sin x = −f(x), so another application of Rolle’s
Theorem tells us that between any two roots of cos there is at least one root of sin.
Therefore, we conclude that the roots of sin and cos interlace each other. This conclusion is
probably not news to the reader; however, the same type of argument can be applied to the
Bessel functions J_n of order n = 0, 1, 2, ... by using the relations

[x^n J_n(x)]' = x^n J_{n−1}(x), [x^{−n} J_n(x)]' = −x^{−n} J_{n+1}(x) for x > 0.

The details of this argument should be supplied by the reader.


(b) We can apply the Mean Value Theorem for approximate calculations and to obtain
error estimates. For example, suppose it is desired to evaluate √105. We employ the Mean
Value Theorem with f(x) := √x, a = 100, b = 105, to obtain

√105 − √100 = 5/(2√c)

for some number c with 100 < c < 105. Since 10 < √c < √105 < √121 = 11, we can
assert that

5/(2(11)) < √105 − 10 < 5/(2(10)),

whence it follows that 10.2272 < √105 < 10.2500. This estimate may not be as sharp as
desired. It is clear that the estimate √c < √105 < √121 was wasteful and can be
improved by making use of our conclusion that √105 < 10.2500. Thus, √c < 10.2500
and we easily determine that

0.2439 < 5/(2(10.2500)) < √105 − 10.

Our improved estimate is 10.2439 < √105 < 10.2500.
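The arithmetic in this estimate is easy to reproduce. The short sketch below recomputes the two lower bounds and the upper bound and compares them with the library value of √105; the variable names are assumptions introduced for the illustration.

```python
import math

# First pass: 10 < sqrt(c) < 11 gives 5/(2*11) < sqrt(105) - 10 < 5/(2*10).
lower1 = 10 + 5 / (2 * 11)
upper = 10 + 5 / (2 * 10)

# Second pass: reuse sqrt(c) < sqrt(105) < upper to sharpen the lower bound.
lower2 = 10 + 5 / (2 * upper)

print(f"first estimate:    {lower1:.4f} < sqrt(105) < {upper:.4f}")
print(f"improved estimate: {lower2:.4f} < sqrt(105) < {upper:.4f}")
print(f"library value:     {math.sqrt(105):.6f}")
```

Only the lower bound improves in the second pass, because the upper bound depends on the estimate 10 < √c, which is already sharp at c = 100.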


Inequalities 


One very important use of the Mean Value Theorem is to obtain certain inequalities. 
Whenever information concerning the range of the derivative of a function is available, this 
information can be used to deduce certain properties of the function itself. The following 
examples illustrate the valuable role that the Mean Value Theorem plays in this respect. 
6.2.10 Examples (a) The exponential function f(x) := e^x has the derivative f'(x) = e^x
for all x ∈ R. Thus f'(x) > 1 for x > 0, and f'(x) < 1 for x < 0. From these relationships,
we will derive the inequality

(1) e^x ≥ 1 + x for x ∈ R,

with equality occurring if and only if x = 0.

If x = 0, we have equality with both sides equal to 1. If x > 0, we apply the Mean
Value Theorem to the function f on the interval [0, x]. Then for some c with 0 < c < x
we have

e^x − e^0 = e^c (x − 0).

Since e^0 = 1 and e^c > 1, this becomes e^x − 1 > x so that we have e^x > 1 + x for x > 0. A
similar argument establishes the same strict inequality for x < 0. Thus the inequality (1)
holds for all x, and equality occurs only if x = 0.

(b) The function g(x) := sin x has the derivative g'(x) = cos x for all x ∈ R. On the basis
of the fact that −1 ≤ cos x ≤ 1 for all x ∈ R, we will show that

(2) −x ≤ sin x ≤ x for all x ≥ 0.

Indeed, if we apply the Mean Value Theorem to g on the interval [0, x], where x > 0, we
obtain

sin x − sin 0 = (cos c)(x − 0)

for some c between 0 and x. Since sin 0 = 0 and −1 ≤ cos c ≤ 1, we have −x ≤ sin x ≤ x.
Since equality holds at x = 0, the inequality (2) is established.


(c) (Bernoulli’s inequality) If α > 1, then

(3) (1 + x)^α ≥ 1 + αx for all x > −1,

with equality if and only if x = 0.

This inequality was established earlier, in Example 2.1.13(c), for positive integer
values of α by using Mathematical Induction. We now derive the more general version by
employing the Mean Value Theorem.

If h(x) := (1 + x)^α then h'(x) = α(1 + x)^{α−1} for all x > −1. [For rational α this
derivative was established in Example 6.1.10(c). The extension to irrational α will be
discussed in Section 8.3.] If x > 0, we infer from the Mean Value Theorem applied to
h on the interval [0, x] that there exists c with 0 < c < x such that h(x) − h(0) = h'(c)(x − 0).
Thus, we have

(1 + x)^α − 1 = α(1 + c)^{α−1} x.

Since c > 0 and α − 1 > 0, it follows that (1 + c)^{α−1} > 1 and hence that (1 + x)^α >
1 + αx. If −1 < x < 0, a similar use of the Mean Value Theorem on the interval [x, 0] leads
to the same strict inequality. Since the case x = 0 results in equality, we conclude that (3) is
valid for all x > −1 with equality if and only if x = 0.

(d) Let α be a real number satisfying 0 < α < 1 and let g(x) = αx − x^α for x ≥ 0. Then
g'(x) = α(1 − x^{α−1}), so that g'(x) < 0 for 0 < x < 1 and g'(x) > 0 for x > 1. Consequently,
if x ≥ 0, then g(x) ≥ g(1) and g(x) = g(1) if and only if x = 1. Therefore, if
x ≥ 0 and 0 < α < 1, then we have

x^α ≤ αx + (1 − α).

If a > 0 and b > 0 and if we let x = a/b and multiply by b, we obtain the inequality

a^α b^{1−α} ≤ αa + (1 − α)b,

where equality holds if and only if a = b.
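The inequalities derived in Examples 6.2.10 can be spot-checked on a grid of sample points. The sketch below does this for inequality (1), Bernoulli’s inequality (3), and the final inequality of part (d). The helper names, the sample exponents, the grids, and the small tolerance TOL (which absorbs floating-point rounding near the equality cases) are all assumptions made for the illustration; passing such a check is evidence, not proof.

```python
import math

TOL = 1e-12  # slack for rounding error near the equality cases


def check_exp(xs):
    """Inequality (1): e^x >= 1 + x."""
    return all(math.exp(x) + TOL >= 1 + x for x in xs)


def check_bernoulli(alpha, xs):
    """Inequality (3): (1 + x)^alpha >= 1 + alpha*x for alpha > 1, x > -1."""
    return all((1 + x) ** alpha + TOL >= 1 + alpha * x for x in xs)


def check_weighted_mean(alpha, pairs):
    """Part (d): a^alpha * b^(1 - alpha) <= alpha*a + (1 - alpha)*b, 0 < alpha < 1."""
    return all(
        a**alpha * b ** (1 - alpha) <= alpha * a + (1 - alpha) * b + TOL
        for a, b in pairs
    )


grid = [-5 + 0.25 * i for i in range(41)]            # x in [-5, 5]
grid_gt_minus_one = [-0.9 + 0.1 * i for i in range(40)]  # x > -1
positive = [0.1 * i for i in range(1, 31)]           # a, b in (0, 3]
pairs = [(a, b) for a in positive for b in positive]

print(check_exp(grid))
print(check_bernoulli(2.5, grid_gt_minus_one))
print(check_weighted_mean(0.3, pairs))
```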


The Intermediate Value Property of Derivatives 


We conclude this section with an interesting result, often referred to as Darboux’s Theorem. It
states that if a function f is differentiable at every point of an interval I, then the function f' has
the Intermediate Value Property. This means that if f' takes on values A and B, then it also
takes on all values between A and B. The reader will recognize this property as one of the
important consequences of continuity as established in Theorem 5.3.7. It is remarkable that
derivatives, which need not be continuous functions, also possess this property.


6.2.11 Lemma Let I ⊆ R be an interval, let f : I → R, let c ∈ I, and assume that f has a
derivative at c. Then:

(a) If f'(c) > 0, then there is a number δ > 0 such that f(x) > f(c) for x ∈ I such that
c < x < c + δ.

(b) If f'(c) < 0, then there is a number δ > 0 such that f(x) > f(c) for x ∈ I such that
c − δ < x < c.


Proof. (a) Since

lim_{x→c} (f(x) − f(c))/(x − c) = f'(c) > 0,

it follows from Theorem 4.2.9 that there is a number δ > 0 such that if x ∈ I and
0 < |x − c| < δ, then

(f(x) − f(c))/(x − c) > 0.

If x ∈ I also satisfies x > c, then we have

f(x) − f(c) = (x − c) · (f(x) − f(c))/(x − c) > 0.

Hence, if x ∈ I and c < x < c + δ, then f(x) > f(c).
The proof of (b) is similar. Q.E.D.


6.2.12 Darboux’s Theorem If f is differentiable on I = [a, b] and if k is a number
between f'(a) and f'(b), then there is at least one point c in (a, b) such that f'(c) = k.


Proof. Suppose that f'(a) < k < f'(b). We define g on I by g(x) := kx − f(x) for x ∈ I.
Since g is continuous, it attains a maximum value on I. Since g'(a) = k − f'(a) > 0, it
follows from Lemma 6.2.11(a) that the maximum of g does not occur at x = a. Similarly,
since g'(b) = k − f'(b) < 0, it follows from Lemma 6.2.11(b) that the maximum does not
occur at x = b. Therefore, g attains its maximum at some c in (a, b). Then from Theorem
6.2.1 we have 0 = g'(c) = k − f'(c). Hence, f'(c) = k. Q.E.D.


6.2.13 Example The function g : [−1, 1] → R defined by

g(x) := 1 for 0 < x ≤ 1,
g(x) := 0 for x = 0,
g(x) := −1 for −1 ≤ x < 0,

(which is a restriction of the signum function) clearly fails to satisfy the intermediate value
property on the interval [−1, 1]. Therefore, by Darboux’s Theorem, there does not exist a
function f such that f'(x) = g(x) for all x ∈ [−1, 1]. In other words, g is not the derivative
on [−1, 1] of any function.


Exercises for Section 6.2 


1. For each of the following functions on R to R, find points of relative extrema, the intervals on
which the function is increasing, and those on which it is decreasing:
(a) f(x) := x^2 − 3x + 5, (b) g(x) := 3x − 4x^2,
(c) h(x) := x^3 − 3x − 4, (d) k(x) := x^4 + 2x^2 − 4.

2. Find the points of relative extrema, the intervals on which the following functions are
increasing, and those on which they are decreasing:
(a) f(x) := x + 1/x for x ≠ 0, (b) g(x) := x/(x^2 + 1) for x ∈ R,
(c) h(x) := √x − 2√(x + 2) for x > 0, (d) k(x) := 2x + 1/x^2 for x ≠ 0.

3. Find the points of relative extrema of the following functions on the specified domain:
(a) f(x) := |x^2 − 1| for −4 ≤ x ≤ 4, (b) g(x) := 1 − (x − 1)^{2/3} for 0 ≤ x ≤ 2,
(c) h(x) := x|x^2 − 12| for −2 ≤ x ≤ 3, (d) k(x) := x(x − 8)^{1/3} for 0 ≤ x ≤ 9.

4. Let a_1, a_2, ..., a_n be real numbers and let f be defined on R by

f(x) := Σ_{i=1}^{n} (a_i − x)^2 for x ∈ R.

Find the unique point of relative minimum for f.

5. Let a > b > 0 and let n ∈ N satisfy n ≥ 2. Prove that a^{1/n} − b^{1/n} < (a − b)^{1/n}. [Hint: Show that
f(x) := x^{1/n} − (x − 1)^{1/n} is decreasing for x ≥ 1, and evaluate f at 1 and a/b.]

6. Use the Mean Value Theorem to prove that |sin x − sin y| ≤ |x − y| for all x, y in R.

7. Use the Mean Value Theorem to prove that (x − 1)/x < ln x < x − 1 for x > 1. [Hint: Use the
fact that D ln x = 1/x for x > 0.]

8. Let f : [a, b] → R be continuous on [a, b] and differentiable in (a, b). Show that if lim_{x→a} f'(x) = A,
then f'(a) exists and equals A. [Hint: Use the definition of f'(a) and the Mean Value Theorem.]

9. Let f : R → R be defined by f(x) := 2x^4 + x^4 sin(1/x) for x ≠ 0 and f(0) := 0. Show that f
has an absolute minimum at x = 0, but that its derivative has both positive and negative values in
every neighborhood of 0.

10. Let g : R → R be defined by g(x) := x + 2x^2 sin(1/x) for x ≠ 0 and g(0) := 0. Show that
g'(0) = 1, but in every neighborhood of 0 the derivative g'(x) takes on both positive and
negative values. Thus g is not monotonic in any neighborhood of 0.

11. Give an example of a uniformly continuous function on [0, 1] that is differentiable on (0, 1) but
whose derivative is not bounded on (0, 1).

12. If h(x) := 0 for x < 0 and h(x) := 1 for x ≥ 0, prove there does not exist a function f : R → R
such that f'(x) = h(x) for all x ∈ R. Give examples of two functions, not differing by a
constant, whose derivatives equal h(x) for all x ≠ 0.

13. Let I be an interval and let f : I → R be differentiable on I. Show that if f' is positive on I, then f
is strictly increasing on I.

14. Let I be an interval and let f : I → R be differentiable on I. Show that if the derivative f' is never
0 on I, then either f'(x) > 0 for all x ∈ I or f'(x) < 0 for all x ∈ I.

15. Let I be an interval. Prove that if f is differentiable on I and if the derivative f' is bounded on I,
then f satisfies a Lipschitz condition on I. (See Definition 5.4.4.)
16. Let f : [0, ∞) → R be differentiable on (0, ∞) and assume that f'(x) → b as x → ∞.
(a) Show that for any h > 0, we have lim_{x→∞} (f(x + h) − f(x))/h = b.
(b) Show that if f(x) → a as x → ∞, then b = 0.
(c) Show that lim_{x→∞} (f(x)/x) = b.

17. Let f, g be differentiable on R and suppose that f(0) = g(0) and f'(x) ≤ g'(x) for all x ≥ 0.
Show that f(x) ≤ g(x) for all x ≥ 0.

18. Let I := [a, b] and let f : I → R be differentiable at c ∈ I. Show that for every ε > 0 there exists
δ > 0 such that if 0 < |x − y| < δ and a ≤ x ≤ c ≤ y ≤ b, then

|(f(x) − f(y))/(x − y) − f'(c)| < ε.

19. A differentiable function f : I → R is said to be uniformly differentiable on I := [a, b] if for
every ε > 0 there exists δ > 0 such that if 0 < |x − y| < δ and x, y ∈ I, then

|(f(x) − f(y))/(x − y) − f'(x)| < ε.

Show that if f is uniformly differentiable on I, then f' is continuous on I.

20. Suppose that f : [0, 2] → R is continuous on [0, 2] and differentiable on (0, 2), and that
f(0) = 0, f(1) = 1, f(2) = 1.
(a) Show that there exists c_1 ∈ (0, 1) such that f'(c_1) = 1.
(b) Show that there exists c_2 ∈ (1, 2) such that f'(c_2) = 0.
(c) Show that there exists c ∈ (0, 2) such that f'(c) = 1/3.

Section 6.3 L’Hospital’s Rules 


In this section we will discuss limit theorems that involve cases that cannot be determined
by previous limit theorems. For example, if f(x) and g(x) both approach 0 as x approaches
a, then the quotient f(x)/g(x) may or may not have a limit at a and it is said to have the
indeterminate form 0/0. The limit theorem for this case is due to Johann Bernoulli and first
appeared in the 1696 book published by L’Hospital.


Johann Bernoulli 

Johann Bernoulli (1667—1748) was born in Basel, Switzerland. Johann 
worked for a year in his father’s spice business, but he was not a success. 
He enrolled in Basel University to study medicine, but his brother Jacob, 
twelve years older and a Professor of Mathematics, led him into mathematics. 
Together, they studied the papers of Leibniz on the new subject of calculus. 
Johann received his doctorate at Basel University and joined the faculty at 
Groningen in Holland, but upon Jacob’s death in 1705, he returned to Basel 
and was awarded Jacob’s chair in mathematics. Because of his many advances 
in the subject, Johann is regarded as one of the founders of calculus. 


While in Paris in 1692, Johann met the Marquis Guillaume François de L’Hospital and agreed
to a financial arrangement under which he would teach the new calculus to L’Hospital, giving
L’Hospital the right to use Bernoulli’s lessons as he pleased. This was subsequently continued
through a series of letters. In 1696, the first book on differential calculus, L’Analyse des Infiniment
Petits, was published by L’Hospital. Though L’Hospital’s name was not on the title page, his
portrait was on the frontispiece and the preface states “I am indebted to the clarifications of the
brothers Bernoulli, especially the younger.” The book contains a theorem on limits later known as
L’Hospital’s Rule although it was in fact discovered by Johann Bernoulli. In 1922, manuscripts
were discovered that confirmed the book consisted mainly of Bernoulli’s lessons. And in 1955,
the L’Hospital-Bernoulli correspondence was published in Germany.
The initial theorem was refined and extended, and the various results are collectively
referred to as L’Hospital’s (or L’Hôpital’s) Rules. In this section we establish the most basic
of these results and indicate how others can be derived.


Indeterminate Forms 


In the preceding chapters we have often been concerned with methods of evaluating limits. It
was shown in Theorem 4.2.4(b) that if A := lim_{x→c} f(x) and B := lim_{x→c} g(x), and if B ≠ 0, then

lim_{x→c} f(x)/g(x) = A/B.

However, if B = 0, then no conclusion was deduced. It will be seen in Exercise 2 that if
B = 0 and A ≠ 0, then the limit is infinite (when it exists).

The case A = 0, B = 0 has not been covered previously. In this case, the limit of the
quotient f/g is said to be “indeterminate.” We will see that in this case the limit may not
exist or may be any real value, depending on the particular functions f and g. The
symbolism 0/0 is used to refer to this situation. For example, if α is any real number, and if
we define f(x) := αx and g(x) := x, then

lim_{x→0} f(x)/g(x) = lim_{x→0} αx/x = lim_{x→0} α = α.

Thus the indeterminate form 0/0 can lead to any real number α as a limit.

Other indeterminate forms are represented by the symbols ∞/∞, 0·∞,
0^0, 1^∞, ∞^0, and ∞ − ∞. These notations correspond to the indicated limiting behavior
and juxtaposition of the functions f and g. Our attention will be focused on the indeterminate
forms 0/0 and ∞/∞. The other indeterminate cases are usually reduced to the form 0/0 or
∞/∞ by taking logarithms, exponentials, or algebraic manipulations.


A Preliminary Result 


To show that the use of differentiation in this context is a natural and not surprising 
development, we first establish an elementary result that is based simply on the definition 
of the derivative. 


6.3.1 Theorem Let f and g be defined on [a, b], let f(a) = g(a) = 0, and let g(x) ≠ 0 for
a < x < b. If f and g are differentiable at a and if g'(a) ≠ 0, then the limit of f/g at a
exists and is equal to f'(a)/g'(a). Thus

lim_{x→a+} f(x)/g(x) = f'(a)/g'(a).

Proof. Since f(a) = g(a) = 0, we can write the quotient f(x)/g(x) for a < x < b
as follows:

f(x)/g(x) = (f(x) − f(a))/(g(x) − g(a)) = [(f(x) − f(a))/(x − a)] / [(g(x) − g(a))/(x − a)].

Applying Theorem 4.2.4(b), we obtain

lim_{x→a+} f(x)/g(x) = [lim_{x→a+} (f(x) − f(a))/(x − a)] / [lim_{x→a+} (g(x) − g(a))/(x − a)] = f'(a)/g'(a). Q.E.D.
X—a-- xX-—-a 
Warning The hypothesis that f(a) = g(a) = 0 is essential here. For example, if f(x)
:= x + 17 and g(x) := 2x + 3 for x ∈ R, then

lim_{x→0} f(x)/g(x) = 17/3, while f'(0)/g'(0) = 1/2.

The preceding result enables us to deal with limits such as

lim_{x→0} (x^2 + x)/(sin 2x) = (2·0 + 1)/(2 cos 0) = 1/2.

To handle limits where f and g are not differentiable at the point a, we need a more general 
version of the Mean Value Theorem due to Cauchy. 


6.3.2 Cauchy Mean Value Theorem Let f and g be continuous on [a, b] and
differentiable on (a, b), and assume that g'(x) ≠ 0 for all x in (a, b). Then there exists
c in (a, b) such that

(f(b) − f(a))/(g(b) − g(a)) = f'(c)/g'(c).


Proof. As in the proof of the Mean Value Theorem, we introduce a function to which
Rolle’s Theorem will apply. First we note that since g'(x) ≠ 0 for all x in (a, b), it follows
from Rolle’s Theorem that g(a) ≠ g(b). For x in [a, b], we now define

h(x) := ((f(b) − f(a))/(g(b) − g(a)))(g(x) − g(a)) − (f(x) − f(a)).

Then h is continuous on [a, b], differentiable on (a, b), and h(a) = h(b) = 0. Therefore, it
follows from Rolle’s Theorem 6.2.3 that there exists a point c in (a, b) such that

0 = h'(c) = ((f(b) − f(a))/(g(b) − g(a))) g'(c) − f'(c).

Since g'(c) ≠ 0, we obtain the desired result by dividing by g'(c). Q.E.D.


Remarks The preceding theorem has a geometric interpretation that is similar to that of
the Mean Value Theorem 6.2.4. The functions f and g can be viewed as determining a curve
in the plane by means of the parametric equations x = f(t), y = g(t) where a ≤ t ≤ b.
Then the conclusion of the theorem is that there exists a point (f(c), g(c)) on the curve for
some c in (a, b) such that the slope g'(c)/f'(c) of the line tangent to the curve at that point is
equal to the slope of the line segment joining the endpoints of the curve.

Note that if g(x) = x, then the Cauchy Mean Value Theorem reduces to the Mean 
Value Theorem 6.2.4. 
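To see the Cauchy Mean Value Theorem at work on a concrete pair, the sketch below finds a point c with (f(b) − f(a))/(g(b) − g(a)) = f'(c)/g'(c) for the sample functions f(x) = x^3 and g(x) = x^2 on [1, 2]. These functions, the helper name cauchy_point, and the use of bisection are assumptions for illustration. Bisection is applied to the derivative of the auxiliary function h from the proof and requires a sign change of h' at the endpoints, which holds here but is not guaranteed in general.

```python
def cauchy_point(f, df, g, dg, a, b, tol=1e-12):
    """Find c in (a, b) with (f(b)-f(a))/(g(b)-g(a)) = df(c)/dg(c).

    Hypothetical helper: bisects on h'(x) = ratio*dg(x) - df(x), where h is
    the auxiliary function used in the proof of Theorem 6.3.2.
    """
    ratio = (f(b) - f(a)) / (g(b) - g(a))
    dh = lambda x: ratio * dg(x) - df(x)
    lo, hi = a, b
    if dh(lo) * dh(hi) > 0:
        raise ValueError("no sign change; bisection does not apply")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        # Keep the half-interval on which h' still changes sign.
        if dh(lo) * dh(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2


f, df = (lambda x: x**3), (lambda x: 3 * x**2)  # sample f (assumption)
g, dg = (lambda x: x**2), (lambda x: 2 * x)     # sample g (assumption)
c = cauchy_point(f, df, g, dg, 1.0, 2.0)
print(c, df(c) / dg(c), (f(2.0) - f(1.0)) / (g(2.0) - g(1.0)))
```

For this pair the quotient of increments is 7/3 and f'(c)/g'(c) = 3c/2, so the computed point should agree with c = 14/9.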


L’Hospital’s Rule, I 


We will now establish the first of L’Hospital’s Rules. For convenience, we will consider
right-hand limits at a point a; left-hand limits and two-sided limits are treated in exactly
the same way. In fact, the theorem even allows the possibility that a = −∞. The reader
should observe that, in contrast with Theorem 6.3.1, the following result does not assume
the differentiability of the functions at the point a. The result asserts that the limiting
behavior of f(x)/g(x) as x → a+ is the same as the limiting behavior of f'(x)/g'(x) as
x → a+, including the case where this limit is infinite. An important hypothesis here is that
both f and g approach 0 as x → a+.


6.3.3 L’Hospital’s Rule, I Let −∞ ≤ a < b ≤ ∞ and let f, g be differentiable on (a, b)
such that g'(x) ≠ 0 for all x ∈ (a, b). Suppose that

(1) lim_{x→a+} f(x) = 0 = lim_{x→a+} g(x).

(a) If lim_{x→a+} f'(x)/g'(x) = L ∈ R, then lim_{x→a+} f(x)/g(x) = L.

(b) If lim_{x→a+} f'(x)/g'(x) = L ∈ {−∞, ∞}, then lim_{x→a+} f(x)/g(x) = L.

Proof. If a < α < β < b, then Rolle’s Theorem implies that g(β) ≠ g(α). Further, by the
Cauchy Mean Value Theorem 6.3.2, there exists u ∈ (α, β) such that

(2) (f(β) − f(α))/(g(β) − g(α)) = f'(u)/g'(u).

Case (a): If L ∈ R and if ε > 0 is given, there exists c ∈ (a, b) such that

L − ε < f'(u)/g'(u) < L + ε for u ∈ (a, c),

whence it follows from (2) that

(3) L − ε < (f(β) − f(α))/(g(β) − g(α)) < L + ε for a < α < β ≤ c.

If we take the limit in (3) as α → a+, we have

L − ε ≤ f(β)/g(β) ≤ L + ε for β ∈ (a, c].

Since ε > 0 is arbitrary, the assertion follows.

Case (b): If L = +∞ and if M > 0 is given, there exists c ∈ (a, b) such that

f'(u)/g'(u) > M for u ∈ (a, c),

whence it follows from (2) that

(4) (f(β) − f(α))/(g(β) − g(α)) > M for a < α < β < c.

If we take the limit in (4) as α → a+, we have

f(β)/g(β) ≥ M for β ∈ (a, c).

Since M > 0 is arbitrary, the assertion follows.
If L = −∞, the argument is similar. Q.E.D.


The corresponding theorem for left-hand limits is readily proved in the same manner.
The result for two-sided limits then follows immediately if both one-sided limits exist and
are equal. In the examples that follow, we will apply the appropriate version of L’Hospital’s
Rule as needed.


6.3.4 Examples (a) We have

lim_{x→0+} (sin x)/√x = lim_{x→0+} (cos x)/(1/(2√x)) = lim_{x→0+} 2√x cos x = 0.

Observe that the denominator is not differentiable at x = 0 so that Theorem 6.3.1
cannot be applied. However f(x) := sin x and g(x) := √x are differentiable on (0, ∞) and
both approach 0 as x → 0+. Moreover, g'(x) ≠ 0 on (0, ∞), so that 6.3.3 is applicable.

(b) We have lim_{x→0} (1 − cos x)/x^2 = lim_{x→0} (sin x)/(2x).

The quotient in the second limit is again indeterminate in the form 0/0. However, the
hypotheses are again satisfied so that a second application of L’Hospital’s Rule is
permissible. Hence, we obtain

lim_{x→0} (1 − cos x)/x^2 = lim_{x→0} (sin x)/(2x) = lim_{x→0} (cos x)/2 = 1/2.

(c) We have lim_{x→0} (e^x − 1)/x = lim_{x→0} e^x/1 = 1.

Again, two applications of L’Hospital’s Rule give

lim_{x→0} (e^x − 1 − x)/x^2 = lim_{x→0} (e^x − 1)/(2x) = lim_{x→0} e^x/2 = 1/2.

(d) We have lim_{x→1} (ln x)/(x − 1) = lim_{x→1} (1/x)/1 = 1.

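The limits in Examples 6.3.4 can be sanity-checked by evaluating each quotient at a point close to the limit point. The sketch below does so; the dictionary layout, the sample points, and the tolerance 2e-3 are assumptions for illustration. Evaluation at a single nearby point only suggests the limit and can mislead for badly behaved quotients, and points that are too close to the limit point lose accuracy to floating-point cancellation.

```python
import math

# Each entry: (quotient, sample point near the limit point, claimed limit).
examples = {
    "(a) sin x / sqrt(x), x -> 0+": (
        lambda x: math.sin(x) / math.sqrt(x), 1e-6, 0.0),
    "(b) (1 - cos x) / x^2, x -> 0": (
        lambda x: (1 - math.cos(x)) / x**2, 1e-3, 0.5),
    "(c) (e^x - 1 - x) / x^2, x -> 0": (
        lambda x: (math.exp(x) - 1 - x) / x**2, 1e-3, 0.5),
    "(d) ln x / (x - 1), x -> 1": (
        lambda x: math.log(x) / (x - 1), 1.0001, 1.0),
}

results = {}
for name, (quotient, point, claimed) in examples.items():
    value = quotient(point)
    results[name] = (value, claimed)
    print(f"{name}: value {value:.6f}, claimed limit {claimed}")
```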
L’Hospital’s Rule, II 


This rule is very similar to the first one, except that it treats the case where the denominator
becomes infinite as x → a+. Again we will consider only right-hand limits, but it is
possible that a = −∞. Left-hand limits and two-sided limits are handled similarly.


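6.3.5 L’Hospital’s Rule, II Let $-\infty \le a < b \le \infty$ and let $f, g$ be differentiable on $(a, b)$
such that $g'(x) \ne 0$ for all $x \in (a, b)$. Suppose that
\[
\lim_{x \to a+} g(x) = \pm\infty. \tag{5}
\]
(a) If $\lim_{x \to a+} \dfrac{f'(x)}{g'(x)} = L \in \mathbb{R}$, then $\lim_{x \to a+} \dfrac{f(x)}{g(x)} = L$.
(b) If $\lim_{x \to a+} \dfrac{f'(x)}{g'(x)} = L \in \{-\infty, \infty\}$, then $\lim_{x \to a+} \dfrac{f(x)}{g(x)} = L$.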


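Proof. We will suppose that (5) holds with limit $\infty$.

As before, we have $g(\beta) \ne g(\alpha)$ for $\alpha, \beta \in (a, b)$, $\alpha < \beta$. Further, equation (2) in the
proof of 6.3.3 holds for some $u \in (\alpha, \beta)$.

Case (a): If $L \in \mathbb{R}$ with $L > 0$ and $\varepsilon > 0$ is given, there is $c \in (a, b)$ such that (3) in the
proof of 6.3.3 holds when $a < \alpha < \beta \le c$. Since $g(x) \to \infty$, we may also assume that
$g(c) > 0$. Taking $\beta = c$ in (3), we have
\[
L - \varepsilon < \frac{f(c) - f(\alpha)}{g(c) - g(\alpha)} < L + \varepsilon \quad \text{for } \alpha \in (a, c). \tag{6}
\]
Since $g(c)/g(\alpha) \to 0$ as $\alpha \to a+$, we may assume that $0 < g(c)/g(\alpha) < 1$ for all
$\alpha \in (a, c)$, whence it follows that
\[
\frac{g(\alpha) - g(c)}{g(\alpha)} = 1 - \frac{g(c)}{g(\alpha)} > 0 \quad \text{for } \alpha \in (a, c).
\]
If we multiply (6) by $(g(\alpha) - g(c))/g(\alpha) > 0$, we have
\[
(L - \varepsilon)\left(1 - \frac{g(c)}{g(\alpha)}\right) < \frac{f(\alpha)}{g(\alpha)} - \frac{f(c)}{g(\alpha)} < (L + \varepsilon)\left(1 - \frac{g(c)}{g(\alpha)}\right). \tag{7}
\]
Now, since $g(c)/g(\alpha) \to 0$ and $f(c)/g(\alpha) \to 0$ as $\alpha \to a+$, then for any $\delta$ with $0 < \delta < 1$
there exists $d \in (a, c)$ such that $0 < g(c)/g(\alpha) < \delta$ and $|f(c)|/g(\alpha) < \delta$ for all $\alpha \in (a, d)$,
whence (7) gives
\[
(L - \varepsilon)(1 - \delta) - \delta < \frac{f(\alpha)}{g(\alpha)} < (L + \varepsilon) + \delta. \tag{8}
\]
If we take $\delta := \min\{1, \varepsilon, \varepsilon/(|L| + 1)\}$, it is an exercise to show that
\[
L - 2\varepsilon \le \frac{f(\alpha)}{g(\alpha)} \le L + 2\varepsilon.
\]
Since $\varepsilon > 0$ is arbitrary, this yields the assertion. The cases $L = 0$ and $L < 0$ are handled
similarly.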


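Case (b): If $L = +\infty$, let $M > 1$ be given and $c \in (a, b)$ be such that $f'(u)/g'(u) > M$
for all $u \in (a, c)$. Then it follows as before that
\[
\frac{f(\beta) - f(\alpha)}{g(\beta) - g(\alpha)} > M \quad \text{for } a < \alpha < \beta \le c. \tag{9}
\]
Since $g(x) \to \infty$ as $x \to a+$, we may suppose that $c$ also satisfies $g(c) > 0$, that
$|f(c)|/g(\alpha) < \tfrac{1}{2}$, and that $0 < g(c)/g(\alpha) < \tfrac{1}{2}$ for all $\alpha \in (a, c)$. If we take $\beta = c$ in (9)
and multiply by $1 - g(c)/g(\alpha) > \tfrac{1}{2}$, we get
\[
\frac{f(\alpha) - f(c)}{g(\alpha)} > M\left(1 - \frac{g(c)}{g(\alpha)}\right) > \tfrac{1}{2} M,
\]
so that
\[
\frac{f(\alpha)}{g(\alpha)} > \tfrac{1}{2} M + \frac{f(c)}{g(\alpha)} > \tfrac{1}{2}(M - 1) \quad \text{for } \alpha \in (a, c).
\]
Since $M > 1$ is arbitrary, it follows that $\lim_{\alpha \to a+} f(\alpha)/g(\alpha) = \infty$.

If $L = -\infty$, the argument is similar. Q.E.D.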


6.3.6 Examples (a) We consider $\lim_{x \to \infty} \dfrac{\ln x}{x}$.

Here $f(x) := \ln x$ and $g(x) := x$ on the interval $(0, \infty)$. If we apply the left-hand
version of 6.3.5, we obtain
\[
\lim_{x \to \infty} \frac{\ln x}{x} = \lim_{x \to \infty} \frac{1/x}{1} = 0.
\]

(b) We consider $\lim_{x \to \infty} e^{-x} x^2$.

Here we take $f(x) := x^2$ and $g(x) := e^x$ on $\mathbb{R}$. We obtain
\[
\lim_{x \to \infty} \frac{x^2}{e^x} = \lim_{x \to \infty} \frac{2x}{e^x} = \lim_{x \to \infty} \frac{2}{e^x} = 0.
\]

(c) We consider $\lim_{x \to 0+} \dfrac{\ln \sin x}{\ln x}$.

Here we take $f(x) := \ln \sin x$ and $g(x) := \ln x$ on $(0, \pi)$. If we apply 6.3.5, we obtain
\[
\lim_{x \to 0+} \frac{\ln \sin x}{\ln x} = \lim_{x \to 0+} \frac{\cos x / \sin x}{1/x} = \lim_{x \to 0+} \left[\frac{x}{\sin x}\right] \cdot [\cos x].
\]
Since $\lim_{x \to 0+} [x/\sin x] = 1$ and $\lim_{x \to 0+} \cos x = 1$, we conclude that the limit under
consideration equals 1.

(d) Consider $\lim_{x \to \infty} \dfrac{x - \sin x}{x + \sin x}$. This has indeterminate form $\infty/\infty$. An application of
L’Hospital’s Rule gives us
\[
\lim_{x \to \infty} \frac{1 - \cos x}{1 + \cos x},
\]
which is useless because this limit does not exist. (Why not?) However, if we rewrite the
original limit, we get directly that
\[
\lim_{x \to \infty} \frac{1 - \dfrac{\sin x}{x}}{1 + \dfrac{\sin x}{x}} = \frac{1 - 0}{1 + 0} = 1.
\]
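The failure in Example 6.3.6(d) can be seen numerically. The sketch below is an illustration only: the function names and the choice of sample points are assumptions made for this check. It evaluates the derivative quotient along two sequences tending to infinity, where it takes values near 0 and near 1 respectively, while the original quotient settles near 1.

```python
import math

def original(x):
    """The quotient (x - sin x)/(x + sin x) from Example 6.3.6(d)."""
    return (x - math.sin(x)) / (x + math.sin(x))

def derivative_ratio(x):
    """The quotient of derivatives (1 - cos x)/(1 + cos x)."""
    return (1.0 - math.cos(x)) / (1.0 + math.cos(x))

# Along x = 2*pi*k the derivative quotient is (nearly) 0 ...
along_even = [derivative_ratio(2 * math.pi * k) for k in (10, 100, 1000)]
# ... while along x = 2*pi*k + pi/2 it is (nearly) 1, so it has no limit.
along_quarter = [derivative_ratio(2 * math.pi * k + math.pi / 2) for k in (10, 100, 1000)]
# The original quotient differs from 1 by at most about 2/x.
tail = [original(x) for x in (1e2, 1e4, 1e6)]

print(along_even, along_quarter, tail)
```

The example is a reminder that the hypothesis "the limit of f'/g' exists" in 6.3.5 cannot be dropped.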


Other Indeterminate Forms 


Indeterminate forms such as $\infty - \infty$, $0 \cdot \infty$, $1^\infty$, $0^0$, $\infty^0$ can be reduced to the previously
considered cases by algebraic manipulations and the use of the logarithmic and exponential
functions. Instead of formulating these variations as theorems, we illustrate the pertinent 
techniques by means of examples. 


6.3.7 Examples (a) Let $I := (0, \pi/2)$ and consider
\[
\lim_{x \to 0+} \left(\frac{1}{x} - \frac{1}{\sin x}\right),
\]
which has the indeterminate form $\infty - \infty$. We have
\[
\lim_{x \to 0+} \left(\frac{1}{x} - \frac{1}{\sin x}\right) = \lim_{x \to 0+} \frac{\sin x - x}{x \sin x} = \lim_{x \to 0+} \frac{\cos x - 1}{\sin x + x \cos x}
= \lim_{x \to 0+} \frac{-\sin x}{2\cos x - x \sin x} = \frac{0}{2} = 0.
\]

(b) Let $I := (0, \infty)$ and consider $\lim_{x \to 0+} x \ln x$, which has the indeterminate form
$0 \cdot (-\infty)$. We have
\[
\lim_{x \to 0+} x \ln x = \lim_{x \to 0+} \frac{\ln x}{1/x} = \lim_{x \to 0+} \frac{1/x}{-1/x^2} = \lim_{x \to 0+} (-x) = 0.
\]

(c) Let $I := (0, \infty)$ and consider $\lim_{x \to 0+} x^x$, which has the indeterminate form $0^0$.
We recall from calculus (see also Section 8.3) that $x^x = e^{x \ln x}$. It follows from part (b)
and the continuity of the function $y \mapsto e^y$ at $y = 0$ that $\lim_{x \to 0+} x^x = e^0 = 1$.

(d) Let $I := (1, \infty)$ and consider $\lim_{x \to \infty} (1 + 1/x)^x$, which has the indeterminate form $1^\infty$.
We note that
\[
(1 + 1/x)^x = e^{x \ln(1 + 1/x)}. \tag{10}
\]
Moreover, we have
\[
\lim_{x \to \infty} x \ln(1 + 1/x) = \lim_{x \to \infty} \frac{\ln(1 + 1/x)}{1/x}
= \lim_{x \to \infty} \frac{(1 + 1/x)^{-1}(-x^{-2})}{-x^{-2}} = \lim_{x \to \infty} \frac{1}{1 + 1/x} = 1.
\]
Since $y \mapsto e^y$ is continuous at $y = 1$, we infer that $\lim_{x \to \infty} (1 + 1/x)^x = e$.

(e) Let $I := (0, \infty)$ and consider $\lim_{x \to 0+} (1 + 1/x)^x$, which has the indeterminate form $\infty^0$.
In view of formula (10), we consider
\[
\lim_{x \to 0+} x \ln(1 + 1/x) = \lim_{x \to 0+} \frac{\ln(1 + 1/x)}{1/x} = \lim_{x \to 0+} \frac{1}{1 + 1/x} = 0.
\]
Therefore we have $\lim_{x \to 0+} (1 + 1/x)^x = e^0 = 1$.
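The exponential rewriting used in Examples 6.3.7(c), (d), and (e) also gives a numerically stable way to evaluate these expressions. The sketch below is a hedged illustration; the function names are hypothetical and the evaluation points are arbitrary choices. It computes each power through `exp` of the logarithmic form and compares with the limits found above.

```python
import math

def x_pow_x(x):
    """x**x computed as exp(x ln x), as in Example 6.3.7(c)."""
    return math.exp(x * math.log(x))

def one_plus_recip_pow(x):
    """(1 + 1/x)**x computed as exp(x ln(1 + 1/x)), following formula (10).

    math.log1p(t) evaluates ln(1 + t) accurately even when t is tiny.
    """
    return math.exp(x * math.log1p(1.0 / x))

def recip_difference(x):
    """1/x - 1/sin x from Example 6.3.7(a)."""
    return 1.0 / x - 1.0 / math.sin(x)

print(recip_difference(1e-3))    # small in absolute value: the limit is 0
print(x_pow_x(1e-8))             # close to 1
print(one_plus_recip_pow(1e8))   # close to e
print(one_plus_recip_pow(1e-8))  # close to 1
```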


Exercises for Section 6.3

1. Suppose that $f$ and $g$ are continuous on $[a, b]$, differentiable on $(a, b)$, that $c \in [a, b]$ and that
$g(x) \ne 0$ for $x \in [a, b]$, $x \ne c$. Let $A := \lim_{x \to c} f$ and $B := \lim_{x \to c} g$. If $B = 0$, and if $\lim_{x \to c} f(x)/g(x)$
exists in $\mathbb{R}$, show that we must have $A = 0$. [Hint: $f(x) = \{f(x)/g(x)\}\,g(x)$.]

2. In addition to the suppositions of the preceding exercise, let $g(x) > 0$ for $x \in [a, b]$, $x \ne c$. If
$A > 0$ and $B = 0$, prove that we must have $\lim_{x \to c} f(x)/g(x) = \infty$. If $A < 0$ and $B = 0$, prove that
we must have $\lim_{x \to c} f(x)/g(x) = -\infty$.

3. Let $f(x) := x^2 \sin(1/x)$ for $0 < x \le 1$ and $f(0) := 0$, and let $g(x) := x^2$ for $x \in [0, 1]$. Then
both $f$ and $g$ are differentiable on $[0, 1]$ and $g(x) > 0$ for $x \ne 0$. Show that $\lim_{x \to 0} f(x) = 0 = \lim_{x \to 0} g(x)$
and that $\lim_{x \to 0} f(x)/g(x)$ does not exist.

4. Let $f(x) := x^2$ for $x$ rational, let $f(x) := 0$ for $x$ irrational, and let $g(x) := \sin x$ for $x \in \mathbb{R}$. Use
Theorem 6.3.1 to show that $\lim_{x \to 0} f(x)/g(x) = 0$. Explain why Theorem 6.3.3 cannot be used.

5. Let $f(x) := x^2 \sin(1/x)$ for $x \ne 0$, let $f(0) := 0$, and let $g(x) := \sin x$ for $x \in \mathbb{R}$. Show that
$\lim_{x \to 0} f(x)/g(x) = 0$ but that $\lim_{x \to 0} f'(x)/g'(x)$ does not exist.


6. Evaluate the following limits.
(a) $\lim_{x \to 0} \dfrac{e^x + e^{-x} - 2}{1 - \cos x}$, (b) $\lim_{x \to 0} \dfrac{x^2 - \sin^2 x}{x^4}$.

7. Evaluate the following limits, where the domain of the quotient is as indicated.
(a) $\lim_{x \to 0+} \dfrac{\ln(x + 1)}{\sin x}$ $(0, \pi/2)$, (b) $\lim_{x \to 0+} \dfrac{\tan x}{x}$ $(0, \pi/2)$,
(c) $\lim_{x \to 0+} \dfrac{\ln \cos x}{x}$ $(0, \pi/2)$, (d) $\lim_{x \to 0+} \dfrac{\tan x - x}{x^3}$ $(0, \pi/2)$.

8. Evaluate the following limits:
(a) $\lim_{x \to 0} \dfrac{\operatorname{Arctan} x}{x}$ $(-\infty, \infty)$, (b) $\lim_{x \to 0+} \dfrac{1}{x (\ln x)^2}$ $(0, 1)$,
(c) $\lim_{x \to 0+} x^3 \ln x$ $(0, \infty)$, (d) $\lim_{x \to \infty} \dfrac{x^3}{e^x}$ $(0, \infty)$.

9. Evaluate the following limits:
(a) $\lim_{x \to \infty} \dfrac{\ln x}{x^2}$ $(0, \infty)$, (b) $\lim_{x \to \infty} \dfrac{\ln x}{\sqrt{x}}$ $(0, \infty)$,
(c) $\lim_{x \to 0} x \ln \sin x$ $(0, \pi)$, (d) $\lim_{x \to \infty} \dfrac{x + \ln x}{x \ln x}$ $(0, \infty)$.

10. Evaluate the following limits:
(a) $\lim_{x \to 0+} x^{2x}$ $(0, \infty)$, (b) $\lim_{x \to 0} (1 + 3/x)^x$ $(0, \infty)$,
(c) $\lim_{x \to \infty} (1 + 3/x)^x$ $(0, \infty)$, (d) $\lim_{x \to 0+} \left(\dfrac{1}{x} - \dfrac{1}{\operatorname{Arctan} x}\right)$ $(0, \infty)$.

11. Evaluate the following limits:
(a) $\lim_{x \to \infty} x^{1/x}$ $(0, \infty)$, (b) $\lim_{x \to 0+} (\sin x)^x$ $(0, \pi)$,
(c) $\lim_{x \to 0+} x^{\sin x}$ $(0, \infty)$, (d) $\lim_{x \to \pi/2-} (\sec x - \tan x)$ $(0, \pi/2)$.

12. Let $f$ be differentiable on $(0, \infty)$ and suppose that $\lim_{x \to \infty} (f(x) + f'(x)) = L$. Show that
$\lim_{x \to \infty} f(x) = L$ and $\lim_{x \to \infty} f'(x) = 0$. [Hint: $f(x) = e^x f(x)/e^x$.]

13. Try to use L’Hospital’s Rule to find the limit of $\dfrac{\tan x}{\sec x}$ as $x \to (\pi/2)-$. Then evaluate directly
by changing to sines and cosines.

14. Show that if $c > 0$, then $\lim_{x \to c} \dfrac{x^c - c^x}{x^x - c^c} = \dfrac{1 - \ln c}{1 + \ln c}$.


Section 6.4 Taylor’s Theorem 


A very useful technique in the analysis of real functions is the approximation of functions 
by polynomials. In this section we will prove a fundamental theorem in this area that goes 
back to Brook Taylor (1685-1731), although the remainder term was not provided until 
much later by Joseph-Louis Lagrange (1736-1813). Taylor’s Theorem is a powerful result 
that has many applications. We will illustrate the versatility of Taylor’s Theorem by briefly 
discussing some of its applications to numerical estimation, inequalities, extreme values of 
a function, and convex functions. 

Taylor’s Theorem can be regarded as an extension of the Mean Value Theorem to 
“higher order” derivatives. Whereas the Mean Value Theorem relates the values of a 
function and its first derivative, Taylor’s Theorem provides a relation between the values of 
a function and its higher order derivatives. 

Derivatives of order greater than one are obtained by a natural extension of the
differentiation process. If the derivative $f'(x)$ of a function $f$ exists at every point $x$ in an
interval $I$ containing a point $c$, then we can consider the existence of the derivative of the
function $f'$ at the point $c$. In case $f'$ has a derivative at the point $c$, we refer to the resulting
number as the second derivative of $f$ at $c$, and we denote this number by $f''(c)$ or by
$f^{(2)}(c)$. In similar fashion we define the third derivative $f'''(c) = f^{(3)}(c), \ldots$, and the $n$th
derivative $f^{(n)}(c)$, whenever these derivatives exist. It is noted that the existence of the
$n$th derivative at $c$ presumes the existence of the $(n - 1)$st derivative in an interval
containing $c$, but we do allow the possibility that $c$ might be an endpoint of such an
interval.

If a function $f$ has an $n$th derivative at a point $x_0$, it is not difficult to construct an
$n$th degree polynomial $P_n$ such that $P_n(x_0) = f(x_0)$ and $P_n^{(k)}(x_0) = f^{(k)}(x_0)$ for
$k = 1, 2, \ldots, n$. In fact, the polynomial
\[
P_n(x) := f(x_0) + f'(x_0)(x - x_0) + \frac{f''(x_0)}{2!}(x - x_0)^2 + \cdots + \frac{f^{(n)}(x_0)}{n!}(x - x_0)^n \tag{1}
\]
has the property that it and its derivatives up to order $n$ agree with the function $f$ and its
derivatives up to order $n$, at the specified point $x_0$. This polynomial $P_n$ is called the $n$th
Taylor polynomial for $f$ at $x_0$. It is natural to expect this polynomial to provide a
reasonable approximation to $f$ for points near $x_0$, but to gauge the quality of the
approximation, it is necessary to have information concerning the remainder
$R_n := f - P_n$. The following fundamental result provides such information.


6.4.1 Taylor’s Theorem Let $n \in \mathbb{N}$, let $I := [a, b]$, and let $f : I \to \mathbb{R}$ be such that $f$ and
its derivatives $f', f'', \ldots, f^{(n)}$ are continuous on $I$ and that $f^{(n+1)}$ exists on $(a, b)$. If $x_0 \in I$,
then for any $x$ in $I$ there exists a point $c$ between $x$ and $x_0$ such that
\[
f(x) = f(x_0) + f'(x_0)(x - x_0) + \frac{f''(x_0)}{2!}(x - x_0)^2
+ \cdots + \frac{f^{(n)}(x_0)}{n!}(x - x_0)^n + \frac{f^{(n+1)}(c)}{(n+1)!}(x - x_0)^{n+1}. \tag{2}
\]

Proof. Let $x_0$ and $x$ be given and let $J$ denote the closed interval with endpoints $x_0$ and $x$.
We define the function $F$ on $J$ by
\[
F(t) := f(x) - f(t) - (x - t) f'(t) - \cdots - \frac{(x - t)^n}{n!} f^{(n)}(t)
\]
for $t \in J$. Then an easy calculation shows that we have
\[
F'(t) = -\frac{(x - t)^n}{n!} f^{(n+1)}(t).
\]
If we define $G$ on $J$ by
\[
G(t) := F(t) - \left(\frac{x - t}{x - x_0}\right)^{n+1} F(x_0)
\]
for $t \in J$, then $G(x_0) = G(x) = 0$. An application of Rolle’s Theorem 6.2.3 yields a point $c$
between $x$ and $x_0$ such that
\[
0 = G'(c) = F'(c) + (n + 1)\frac{(x - c)^n}{(x - x_0)^{n+1}} F(x_0).
\]
Hence, we obtain
\[
F(x_0) = -\frac{1}{n+1}\,\frac{(x - x_0)^{n+1}}{(x - c)^n}\,F'(c)
= \frac{1}{n+1}\,\frac{(x - x_0)^{n+1}}{(x - c)^n}\,\frac{(x - c)^n}{n!}\,f^{(n+1)}(c)
= \frac{f^{(n+1)}(c)}{(n+1)!}(x - x_0)^{n+1},
\]
which implies the stated result. Q.E.D.


We shall use the notation $P_n$ for the $n$th Taylor polynomial (1) of $f$, and $R_n$ for the
remainder. Thus we may write the conclusion of Taylor’s Theorem as $f(x) = P_n(x) + R_n(x)$
where $R_n$ is given by
\[
R_n(x) := \frac{f^{(n+1)}(c)}{(n+1)!}(x - x_0)^{n+1} \tag{3}
\]
for some point $c$ between $x$ and $x_0$. This formula for $R_n$ is referred to as the Lagrange form
(or the derivative form) of the remainder. Many other expressions for $R_n$ are known; one is
in terms of integration and will be given later. (See Theorem 7.3.18.)


Applications of Taylor’s Theorem 


The remainder term $R_n$ in Taylor’s Theorem can be used to estimate the error in
approximating a function by its Taylor polynomial $P_n$. If the number $n$ is prescribed,
then the question of the accuracy of the approximation arises. On the other hand, if a certain 
accuracy is specified, then the question of finding a suitable value of n is germane. The 
following examples illustrate how one responds to these questions. 


6.4.2 Examples (a) Use Taylor’s Theorem with $n = 2$ to approximate $\sqrt[3]{1 + x}$, $x > -1$.
We take the function $f(x) := (1 + x)^{1/3}$, the point $x_0 = 0$, and $n = 2$. Since $f'(x) = \tfrac{1}{3}(1 + x)^{-2/3}$
and $f''(x) = \tfrac{1}{3}\left(-\tfrac{2}{3}\right)(1 + x)^{-5/3}$, we have $f'(0) = \tfrac{1}{3}$ and $f''(0) = -2/9$. Thus
we obtain
\[
f(x) = P_2(x) + R_2(x) = 1 + \tfrac{1}{3}x - \tfrac{1}{9}x^2 + R_2(x),
\]
where $R_2(x) = \tfrac{1}{3!} f'''(c)\,x^3 = \tfrac{5}{81}(1 + c)^{-8/3} x^3$ for some point $c$ between 0 and $x$.
For example, if we let $x = 0.3$, we get the approximation $P_2(0.3) = 1.09$ for $\sqrt[3]{1.3}$.
Moreover, since $c > 0$ in this case, then $(1 + c)^{-8/3} < 1$ and so the error is at most
\[
R_2(0.3) \le \frac{5}{81}\left(\frac{3}{10}\right)^3 = \frac{1}{600} < 0.17 \times 10^{-2}.
\]
Hence, we have $|\sqrt[3]{1.3} - 1.09| < 0.5 \times 10^{-2}$, so that two decimal place accuracy is assured.
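The arithmetic in part (a) can be checked directly. The following minimal sketch uses hypothetical function names, and relies on floating-point exponentiation for the "true" cube root. It evaluates $P_2(0.3)$, the bound $\tfrac{5}{81}x^3$, and the actual error.

```python
def p2_cuberoot(x):
    """Second Taylor polynomial of (1 + x)**(1/3) at x0 = 0: 1 + x/3 - x^2/9."""
    return 1 + x / 3 - x * x / 9

def remainder_bound(x):
    """The bound (5/81) x^3 on |R_2(x)|, valid for x > 0 since (1 + c)^(-8/3) < 1."""
    return 5 * x ** 3 / 81

x = 0.3
approx = p2_cuberoot(x)           # the approximation 1.09 (up to rounding)
actual = (1 + x) ** (1 / 3)       # floating-point cube root of 1.3
error = abs(actual - approx)

print(approx, actual, error, remainder_bound(x))
```

The observed error is smaller than the bound $1/600$, as the example predicts; the bound is not sharp because $(1 + c)^{-8/3}$ was replaced by 1.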

(b) Approximate the number $e$ with error less than $10^{-5}$.

We shall consider the function $g(x) := e^x$ and take $x_0 = 0$ and $x = 1$ in Taylor’s
Theorem. We need to determine $n$ so that $|R_n(1)| < 10^{-5}$. To do so, we shall use the fact
that $g'(x) = e^x$ and the initial bound of $e^x \le 3$ for $0 \le x \le 1$.

Since $g'(x) = e^x$, it follows that $g^{(k)}(x) = e^x$ for all $k \in \mathbb{N}$, and therefore $g^{(k)}(0) = 1$
for all $k \in \mathbb{N}$. Consequently the $n$th Taylor polynomial is given by
\[
P_n(x) := 1 + x + \frac{x^2}{2!} + \cdots + \frac{x^n}{n!}
\]
and the remainder for $x = 1$ is given by $R_n(1) = e^c/(n + 1)!$ for some $c$ satisfying $0 < c < 1$.
Since $e^c < 3$, we seek a value of $n$ such that $3/(n + 1)! < 10^{-5}$. A calculation reveals that
$9! = 362{,}880 > 3 \times 10^5$ so that the value $n = 8$ will provide the desired accuracy; moreover,
since $8! = 40{,}320$, no smaller value of $n$ will be certain to suffice. Thus, we obtain
\[
e \approx P_8(1) = 1 + 1 + \frac{1}{2!} + \cdots + \frac{1}{8!} = 2.718\,278
\]
with error less than $10^{-5}$.
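The search for $n$ in part (b) is easy to mechanize. The sketch below is an illustration under the same assumption as the text, namely the crude bound $e^c < 3$ on $(0, 1)$; the function names are hypothetical. Exact rational arithmetic is used so that the comparison with $10^{-5}$ involves no rounding.

```python
from fractions import Fraction
from math import e, factorial

def smallest_n(tol, bound=3):
    """Smallest n with bound/(n+1)! < tol, where bound dominates e^c on (0, 1)."""
    n = 0
    while Fraction(bound, factorial(n + 1)) >= tol:
        n += 1
    return n

def taylor_e(n):
    """P_n(1) = 1 + 1 + 1/2! + ... + 1/n! as an exact fraction."""
    return sum(Fraction(1, factorial(k)) for k in range(n + 1))

n = smallest_n(Fraction(1, 10 ** 5))
p = taylor_e(n)
print(n, p, float(p), abs(float(p) - e))
```

With the tolerance $10^{-5}$ this selects $n = 8$ and $P_8(1) = 109601/40320$, in agreement with the calculation above.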


Taylor’s Theorem can also be used to derive inequalities.

6.4.3 Examples (a) $1 - \tfrac{1}{2}x^2 \le \cos x$ for all $x \in \mathbb{R}$.
Use $f(x) := \cos x$ and $x_0 = 0$ in Taylor’s Theorem, to obtain
\[
\cos x = 1 - \tfrac{1}{2}x^2 + R_2(x),
\]
where for some $c$ between 0 and $x$ we have
\[
R_2(x) = \frac{f'''(c)}{3!}x^3 = \frac{\sin c}{6}x^3.
\]
If $0 \le x \le \pi$, then $0 \le c < \pi$; since $\sin c$ and $x^3$ are both nonnegative, we have $R_2(x) \ge 0$. Also,
if $-\pi \le x \le 0$, then $-\pi < c \le 0$; since $\sin c$ and $x^3$ are both nonpositive, we again have
$R_2(x) \ge 0$. Therefore, we see that $1 - \tfrac{1}{2}x^2 \le \cos x$ for $|x| \le \pi$. If $|x| \ge \pi$, then we have
$1 - \tfrac{1}{2}x^2 < -3 \le \cos x$ and the inequality is trivially valid. Hence, the inequality holds for
all $x \in \mathbb{R}$.

(b) For any $k \in \mathbb{N}$, and for all $x > 0$, we have
\[
x - \tfrac{1}{2}x^2 + \cdots - \tfrac{1}{2k}x^{2k} < \ln(1 + x) < x - \tfrac{1}{2}x^2 + \cdots + \tfrac{1}{2k+1}x^{2k+1}.
\]
Using the fact that the derivative of $\ln(1 + x)$ is $1/(1 + x)$ for $x > 0$, we see that the $n$th
Taylor polynomial for $\ln(1 + x)$ with $x_0 = 0$ is
\[
P_n(x) = x - \tfrac{1}{2}x^2 + \tfrac{1}{3}x^3 - \cdots + (-1)^{n-1}\tfrac{1}{n}x^n
\]
and the remainder is given by
\[
R_n(x) = \frac{(-1)^n}{(n + 1)(1 + c)^{n+1}}\,x^{n+1}
\]
for some $c$ satisfying $0 < c < x$. Thus for any $x > 0$, if $n = 2k$ is even, then we have
$R_{2k}(x) > 0$; and if $n = 2k + 1$ is odd, then we have $R_{2k+1}(x) < 0$. The stated inequality
then follows immediately.

(c) $e^\pi > \pi^e$.
Taylor’s Theorem gives us the inequality $e^x > 1 + x$ for $x > 0$, which the reader
should verify. Then, since $\pi > e$, we have $x = \pi/e - 1 > 0$, so that
\[
e^{(\pi/e - 1)} > 1 + (\pi/e - 1) = \pi/e.
\]
This implies $e^{\pi/e} > (\pi/e)e = \pi$, and thus we obtain the inequality $e^\pi > \pi^e$.
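The three inequalities of Examples 6.4.3 can be spot-checked numerically. This is a sketch only: the helper names and the sample values of $x$ and $k$ are arbitrary assumptions, and a finite check of course proves nothing.

```python
import math

def cos_lower_bound(x):
    """The polynomial 1 - x^2/2 from Example 6.4.3(a)."""
    return 1 - x * x / 2

def log_partial_sum(x, n):
    """The Taylor polynomial x - x^2/2 + ... + (-1)^(n-1) x^n / n of ln(1 + x)."""
    return sum((-1) ** (j - 1) * x ** j / j for j in range(1, n + 1))

# (a) 1 - x^2/2 <= cos x at a few sample points, including |x| > pi.
part_a = all(cos_lower_bound(x) <= math.cos(x) for x in (-4.0, -1.0, 0.0, 0.5, 3.0, 10.0))

# (b) P_{2k}(x) < ln(1 + x) < P_{2k+1}(x) for sample x > 0 and k = 1, 2, 3.
part_b = all(
    log_partial_sum(x, 2 * k) < math.log1p(x) < log_partial_sum(x, 2 * k + 1)
    for x in (0.1, 0.5, 1.0, 2.0)
    for k in (1, 2, 3)
)

# (c) e^pi > pi^e.
part_c = math.exp(math.pi) > math.pi ** math.e

print(part_a, part_b, part_c)
```

Note that the sandwich in (b) holds even for $x = 2$, where the partial sums themselves diverge as $k$ grows; the inequality is a statement about each fixed $k$.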


Relative Extrema 


It was established in Theorem 6.2.1 that if a function $f : I \to \mathbb{R}$ is differentiable at a point $c$
interior to the interval $I$, then a necessary condition for $f$ to have a relative extremum at $c$ is
that $f'(c) = 0$. One way to determine whether $f$ has a relative maximum or relative
minimum (or neither) at $c$ is to use the First Derivative Test 6.2.8. Higher order derivatives,
if they exist, can also be used in this determination, as we now show.


6.4.4 Theorem Let $I$ be an interval, let $x_0$ be an interior point of $I$, and let $n \ge 2$.
Suppose that the derivatives $f', f'', \ldots, f^{(n)}$ exist and are continuous in a neighborhood
of $x_0$ and that $f'(x_0) = \cdots = f^{(n-1)}(x_0) = 0$, but $f^{(n)}(x_0) \ne 0$.

(i) If $n$ is even and $f^{(n)}(x_0) > 0$, then $f$ has a relative minimum at $x_0$.

(ii) If $n$ is even and $f^{(n)}(x_0) < 0$, then $f$ has a relative maximum at $x_0$.

(iii) If $n$ is odd, then $f$ has neither a relative minimum nor relative maximum at $x_0$.

Proof. Applying Taylor’s Theorem at $x_0$, we find that for $x \in I$ we have
\[
f(x) = P_{n-1}(x) + R_{n-1}(x) = f(x_0) + \frac{f^{(n)}(c)}{n!}(x - x_0)^n,
\]
where $c$ is some point between $x_0$ and $x$. Since $f^{(n)}$ is continuous, if $f^{(n)}(x_0) \ne 0$, then there
exists an interval $U$ containing $x_0$ such that $f^{(n)}(x)$ will have the same sign as $f^{(n)}(x_0)$ for
$x \in U$. If $x \in U$, then the point $c$ also belongs to $U$ and consequently $f^{(n)}(c)$ and $f^{(n)}(x_0)$
will have the same sign.

(i) If $n$ is even and $f^{(n)}(x_0) > 0$, then for $x \in U$ we have $f^{(n)}(c) > 0$ and $(x - x_0)^n \ge 0$
so that $R_{n-1}(x) \ge 0$. Hence, $f(x) \ge f(x_0)$ for $x \in U$, and therefore $f$ has a relative
minimum at $x_0$.

(ii) If $n$ is even and $f^{(n)}(x_0) < 0$, then it follows that $R_{n-1}(x) \le 0$ for $x \in U$, so that
$f(x) \le f(x_0)$ for $x \in U$. Therefore, $f$ has a relative maximum at $x_0$.

(iii) If $n$ is odd, then $(x - x_0)^n$ is positive if $x > x_0$ and negative if $x < x_0$. Consequently,
if $x \in U$, then $R_{n-1}(x)$ will have opposite signs to the left and to the right of $x_0$.
Therefore, $f$ has neither a relative minimum nor a relative maximum at $x_0$. Q.E.D.
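For a polynomial, Theorem 6.4.4 turns into a purely mechanical test, because the coefficient of $x^k$ equals $f^{(k)}(0)/k!$. The sketch below is a hypothetical helper, restricted for simplicity to the point $x_0 = 0$ and to polynomials given by their coefficient lists. It finds the first nonvanishing derivative and applies cases (i) through (iii).

```python
def classify_at_zero(coeffs):
    """Classify x0 = 0 for the polynomial sum(coeffs[k] * x**k).

    Since coeffs[k] = f^(k)(0) / k!, the first index n >= 1 with
    coeffs[n] != 0 is the order of the first nonvanishing derivative,
    and the sign of coeffs[n] is the sign of f^(n)(0).
    """
    for n in range(1, len(coeffs)):
        if coeffs[n] != 0:
            break
    else:
        # All derivatives vanish (constant polynomial): the theorem does not apply.
        return "test does not apply"
    if n == 1:
        return "not a critical point"      # f'(0) != 0
    if n % 2 == 1:
        return "neither"                   # case (iii)
    # n is even: case (i) or case (ii)
    return "relative minimum" if coeffs[n] > 0 else "relative maximum"

print(classify_at_zero([0, 0, 0, 0, 1]))   # x^4
print(classify_at_zero([5, 0, -1]))        # 5 - x^2
print(classify_at_zero([2, 0, 0, 1]))      # 2 + x^3
```

For $x^4$ the familiar second derivative test is silent, since $f''(0) = 0$, while the fourth derivative settles the question.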


Convex Functions 


The notion of convexity plays an important role in a number of areas, particularly in the 
modern theory of optimization. We shall briefly look at convex functions of one real 
variable and their relation to differentiation. The basic results, when appropriately 
modified, can be extended to higher dimensional spaces. 


6.4.5 Definition Let $I \subseteq \mathbb{R}$ be an interval. A function $f : I \to \mathbb{R}$ is said to be convex on $I$
if for any $t$ satisfying $0 \le t \le 1$ and any points $x_1, x_2$ in $I$, we have
\[
f((1 - t)x_1 + t x_2) \le (1 - t) f(x_1) + t f(x_2).
\]

Note that if $x_1 < x_2$, then as $t$ ranges from 0 to 1, the point $(1 - t)x_1 + t x_2$ traverses
the interval from $x_1$ to $x_2$. Thus if $f$ is convex on $I$ and if $x_1, x_2 \in I$, then the chord joining
any two points $(x_1, f(x_1))$ and $(x_2, f(x_2))$ on the graph of $f$ lies above the graph of $f$. (See
Figure 6.4.1.)


Figure 6.4.1 A convex function (the chord $y = (1 - t) f(x_1) + t f(x_2)$ lies above the graph $y = f((1 - t)x_1 + t x_2)$ between $x_1$ and $x_2$)


A convex function need not be differentiable at every point, as the example
$f(x) := |x|$, $x \in \mathbb{R}$, reveals. However, it can be shown that if $I$ is an open interval and if
$f : I \to \mathbb{R}$ is convex on $I$, then the left and right derivatives of $f$ exist at every point of $I$. As a
consequence, it follows that a convex function on an open interval is necessarily continuous.
We will not verify the preceding assertions, nor will we develop many other interesting
properties of convex functions. Rather, we will restrict ourselves to establishing the
connection between a convex function $f$ and its second derivative $f''$, assuming that $f''$ exists.


6.4.6 Theorem Let $I$ be an open interval and let $f : I \to \mathbb{R}$ have a second derivative on
$I$. Then $f$ is a convex function on $I$ if and only if $f''(x) \ge 0$ for all $x \in I$.

Proof. ($\Rightarrow$) We will make use of the fact that the second derivative is given by the limit
\[
f''(a) = \lim_{h \to 0} \frac{f(a + h) - 2f(a) + f(a - h)}{h^2} \tag{4}
\]
for each $a \in I$. (See Exercise 16.) Given $a \in I$, let $h$ be such that $a + h$ and $a - h$ belong to $I$.
Then $a = \tfrac{1}{2}((a + h) + (a - h))$, and since $f$ is convex on $I$, we have
\[
f(a) = f\left(\tfrac{1}{2}(a + h) + \tfrac{1}{2}(a - h)\right) \le \tfrac{1}{2} f(a + h) + \tfrac{1}{2} f(a - h).
\]
Therefore, we have $f(a + h) - 2f(a) + f(a - h) \ge 0$. Since $h^2 > 0$ for all $h \ne 0$, we see
that the limit in (4) must be nonnegative. Hence, we obtain $f''(a) \ge 0$ for any $a \in I$.

($\Leftarrow$) We will use Taylor’s Theorem. Let $x_1, x_2$ be any two points of $I$, let $0 < t < 1$,
and let $x_0 := (1 - t)x_1 + t x_2$. Applying Taylor’s Theorem to $f$ at $x_0$ we obtain a point $c_1$
between $x_0$ and $x_1$ such that
\[
f(x_1) = f(x_0) + f'(x_0)(x_1 - x_0) + \tfrac{1}{2} f''(c_1)(x_1 - x_0)^2,
\]
and a point $c_2$ between $x_0$ and $x_2$ such that
\[
f(x_2) = f(x_0) + f'(x_0)(x_2 - x_0) + \tfrac{1}{2} f''(c_2)(x_2 - x_0)^2.
\]
If $f''$ is nonnegative on $I$, then the term
\[
R := \tfrac{1}{2}(1 - t) f''(c_1)(x_1 - x_0)^2 + \tfrac{1}{2} t f''(c_2)(x_2 - x_0)^2
\]
is also nonnegative. Thus we obtain
\[
\begin{aligned}
(1 - t) f(x_1) + t f(x_2) &= f(x_0) + f'(x_0)\bigl((1 - t)x_1 + t x_2 - x_0\bigr) \\
&\quad + \tfrac{1}{2}(1 - t) f''(c_1)(x_1 - x_0)^2 + \tfrac{1}{2} t f''(c_2)(x_2 - x_0)^2 \\
&= f(x_0) + R \\
&\ge f(x_0) = f((1 - t)x_1 + t x_2).
\end{aligned}
\]
Hence, $f$ is a convex function on $I$. Q.E.D.
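The symmetric quotient in (4) is also a convenient numerical probe. In the sketch below, the function names, the step size `h`, and the rounding allowance `slack` are all illustrative assumptions. The quotient is evaluated at sample points; a clearly negative value rules out convexity near that point, while nonnegative values are merely consistent with it.

```python
import math

def second_difference(f, a, h):
    """The quotient (f(a + h) - 2 f(a) + f(a - h)) / h^2 appearing in (4)."""
    return (f(a + h) - 2 * f(a) + f(a - h)) / (h * h)

def passes_convexity_check(f, points, h=1e-3, slack=1e-6):
    """True if the quotient is (up to rounding slack) nonnegative at every sample point.

    This is only a necessary-condition check at finitely many points.
    """
    return all(second_difference(f, a, h) >= -slack for a in points)

print(second_difference(math.exp, 0.0, 1e-3))               # close to exp''(0) = 1
print(passes_convexity_check(math.exp, [-1.0, 0.0, 1.0]))   # exp is convex
print(passes_convexity_check(math.sin, [1.0]))              # sin'' (1) < 0
print(passes_convexity_check(abs, [0.0]))                   # |x| is convex, though not differentiable at 0
```

For $f(x) = |x|$ at $a = 0$ the quotient equals $2/|h|$, which is nonnegative for every $h \ne 0$, consistent with the convexity inequality used in the first half of the proof.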


Newton’s Method 


It is often desirable to estimate a solution of an equation with a high degree of accuracy. 
The Bisection Method, used in the proof of the Location of Roots Theorem 5.3.5, provides 
one estimation procedure, but it has the disadvantage of converging to a solution rather 
slowly. A method that often results in much more rapid convergence is based on the 
geometric idea of successively approximating a curve by tangent lines. The method is 
named after its discoverer, Isaac Newton.

Let $f$ be a differentiable function that has a zero at $r$ and let $x_1$ be an initial estimate of
$r$. The line tangent to the graph at $(x_1, f(x_1))$ has the equation $y = f(x_1) + f'(x_1)(x - x_1)$,
and crosses the x-axis at the point
\[
x_2 := x_1 - \frac{f(x_1)}{f'(x_1)}.
\]
(See Figure 6.4.2.) If we replace $x_1$ by the second estimate $x_2$, then we obtain a point $x_3$,
and so on. At the $n$th iteration we get the point $x_{n+1}$ from the point $x_n$ by the formula
\[
x_{n+1} := x_n - \frac{f(x_n)}{f'(x_n)}.
\]
Under suitable hypotheses, the sequence $(x_n)$ will converge rapidly to a root of the equation
$f(x) = 0$, as we now show. The key tool in establishing the rapid rate of convergence is
Taylor’s Theorem.

Figure 6.4.2 Newton’s Method


6.4.7 Newton’s Method Let $I := [a, b]$ and let $f : I \to \mathbb{R}$ be twice differentiable on $I$.
Suppose that $f(a) f(b) < 0$ and that there are constants $m, M$ such that $|f'(x)| \ge m > 0$
and $|f''(x)| \le M$ for $x \in I$ and let $K := M/2m$. Then there exists a subinterval $I^*$
containing a zero $r$ of $f$ such that for any $x_1 \in I^*$ the sequence $(x_n)$ defined by
\[
x_{n+1} := x_n - \frac{f(x_n)}{f'(x_n)} \quad \text{for all } n \in \mathbb{N}, \tag{5}
\]
belongs to $I^*$ and $(x_n)$ converges to $r$. Moreover
\[
|x_{n+1} - r| \le K |x_n - r|^2 \quad \text{for all } n \in \mathbb{N}. \tag{6}
\]


Proof. Since $f(a) f(b) < 0$, the numbers $f(a)$ and $f(b)$ have opposite signs; hence by
Theorem 5.3.5 there exists $r \in I$ such that $f(r) = 0$. Since $f'$ is never zero on $I$, it follows
from Rolle’s Theorem 6.2.3 that $f$ does not vanish at any other point of $I$.

We now let $x' \in I$ be arbitrary; by Taylor’s Theorem there exists a point $c'$ between $x'$
and $r$ such that
\[
0 = f(r) = f(x') + f'(x')(r - x') + \tfrac{1}{2} f''(c')(r - x')^2,
\]
from which it follows that
\[
-f(x') = f'(x')(r - x') + \tfrac{1}{2} f''(c')(r - x')^2.
\]
If $x''$ is the number defined from $x'$ by “the Newton procedure”:
\[
x'' := x' - \frac{f(x')}{f'(x')},
\]
then an elementary calculation shows that
\[
x'' = x' + (r - x') + \frac{1}{2}\,\frac{f''(c')}{f'(x')}(r - x')^2,
\]
whence it follows that
\[
x'' - r = \frac{1}{2}\,\frac{f''(c')}{f'(x')}(x' - r)^2.
\]
Since $c' \in I$, the assumed bounds on $f'$ and $f''$ hold and, setting $K := M/2m$, we obtain the
inequality
\[
|x'' - r| \le K |x' - r|^2. \tag{7}
\]
We now choose $\delta > 0$ so small that $\delta < 1/K$ and that the interval $I^* := [r - \delta, r + \delta]$ is
contained in $I$. If $x_n \in I^*$, then $|x_n - r| \le \delta$ and it follows from (7) that
$|x_{n+1} - r| \le K|x_n - r|^2 \le K\delta^2 < \delta$; hence $x_n \in I^*$ implies that $x_{n+1} \in I^*$. Therefore if
$x_1 \in I^*$, we infer that $x_n \in I^*$ for all $n \in \mathbb{N}$. Also if $x_1 \in I^*$, then an elementary induction
argument using (7) shows that $|x_{n+1} - r| < (K\delta)^n |x_1 - r|$ for $n \in \mathbb{N}$. But since $K\delta < 1$ this
proves that $\lim(x_n) = r$. Q.E.D.
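The iteration (5) and the error estimate (6) can be observed on a concrete function. The sketch below is an illustration under stated assumptions: the test function $f(x) = x^3 - 2$ on $I = [1, 2]$, the starting point $1.5$, and the names `newton`, `f`, `fprime` are choices made here, not taken from the text. On $[1, 2]$ we have $|f'(x)| = 3x^2 \ge 3 =: m$ and $|f''(x)| = 6x \le 12 =: M$, so $K = M/2m = 2$.

```python
def newton(f, fprime, x1, steps):
    """Return [x_1, x_2, ..., x_{steps+1}] from x_{n+1} = x_n - f(x_n)/f'(x_n)."""
    xs = [x1]
    for _ in range(steps):
        x = xs[-1]
        xs.append(x - f(x) / fprime(x))
    return xs

def f(x):
    return x ** 3 - 2.0

def fprime(x):
    return 3.0 * x ** 2

m, M = 3.0, 12.0            # bounds for |f'| and |f''| on [1, 2]
K = M / (2 * m)             # K = 2
r = 2.0 ** (1.0 / 3.0)      # reference value of the zero (floating point)

xs = newton(f, fprime, 1.5, 6)
errors = [abs(x - r) for x in xs]

# Check |x_{n+1} - r| <= K |x_n - r|^2; the small additive slack absorbs
# rounding once the errors reach machine precision.
quadratic_ok = all(e_next <= K * e ** 2 + 1e-12 for e, e_next in zip(errors, errors[1:]))

for x, err in zip(xs, errors):
    print(f"{x:.15f}  error {err:.3e}")
print(quadratic_ok)
```

The printed errors shrink roughly like the square of the previous error until floating-point precision is exhausted, which is the behavior that inequality (6) describes.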


6.4.8 Example We will illustrate Newton’s Method by using it to approximate $\sqrt{2}$.
If we let $f(x) := x^2 - 2$ for $x \in \mathbb{R}$, then we seek the positive root of the equation
$f(x) = 0$. Since $f'(x) = 2x$, the iteration formula is
\[
x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)} = x_n - \frac{x_n^2 - 2}{2x_n} = \frac{1}{2}\left(x_n + \frac{2}{x_n}\right).
\]
If we take $x_1 := 1$ as our initial estimate, we obtain the successive values $x_2 = 3/2 = 1.5$,
$x_3 = 17/12 = 1.416\,666\ldots$, $x_4 = 577/408 = 1.414\,215\ldots$, and $x_5 = 665\,857/470\,832
= 1.414\,213\,562\,374\ldots$, which is correct to eleven places.
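The fractions in Example 6.4.8 can be reproduced exactly with rational arithmetic. This is a minimal sketch; the function name `newton_sqrt2` is hypothetical. It iterates $x_{n+1} = \tfrac{1}{2}(x_n + 2/x_n)$ starting from $x_1 = 1$.

```python
from fractions import Fraction

def newton_sqrt2(steps):
    """Return [x_1, ..., x_{steps+1}] for x_{n+1} = (x_n + 2/x_n)/2 with x_1 = 1."""
    x = Fraction(1)
    xs = [x]
    for _ in range(steps):
        x = (x + 2 / x) / 2   # exact: int / Fraction is again a Fraction
        xs.append(x)
    return xs

for n, x in enumerate(newton_sqrt2(4), start=1):
    print(f"x_{n} = {x} = {float(x):.12f}")
```

Working with `Fraction` avoids rounding entirely, so the iterates agree with the values $3/2$, $17/12$, $577/408$, and $665857/470832$ stated in the example.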


Remarks (a) If we let $e_n := x_n - r$ be the error in approximating $r$, then inequality (6)
can be written in the form $|K e_{n+1}| \le |K e_n|^2$. Consequently, if $|K e_n| < 10^{-m}$ then
$|K e_{n+1}| < 10^{-2m}$ so that the number of significant digits in $K e_n$ has been doubled. Because
of this doubling, the sequence generated by Newton’s Method is said to converge
“quadratically.”

(b) In practice, when Newton’s Method is programmed for a computer, one often makes
an initial guess $x_1$ and lets the computer run. If $x_1$ is poorly chosen, or if the root is too near
the endpoint of $I$, the procedure may not converge to a zero of $f$. Two possible difficulties
are illustrated in Figures 6.4.3 and 6.4.4. One familiar strategy is to use the Bisection
Method to arrive at a fairly close estimate of the root and then to switch to Newton’s
Method for the coup de grâce.
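The strategy described in Remark (b) might be sketched as follows. Everything specific here is an assumption made for illustration: the function names, the bracket width at which the switch occurs, the stopping tolerance, and the test equation $x = e^{-x}$ on $[0, 1]$.

```python
import math

def bisect(f, a, b, width):
    """Halve [a, b], assuming f(a) f(b) < 0, until b - a <= width."""
    fa = f(a)
    while b - a > width:
        mid = (a + b) / 2
        fm = f(mid)
        if fm == 0:
            return mid, mid          # landed exactly on a zero
        if fa * fm < 0:
            b = mid                  # sign change in the left half
        else:
            a, fa = mid, fm          # sign change in the right half
    return a, b

def bisect_then_newton(f, fprime, a, b, width=1e-2, tol=1e-12, max_steps=20):
    """Bracket a zero by bisection, then polish the midpoint with Newton steps."""
    a, b = bisect(f, a, b, width)
    x = (a + b) / 2
    for _ in range(max_steps):
        step = f(x) / fprime(x)
        x -= step
        if abs(step) < tol:
            break
    return x

def f(x):
    return x - math.exp(-x)

def fprime(x):
    return 1.0 + math.exp(-x)

root = bisect_then_newton(f, fprime, 0.0, 1.0)
print(root, f(root))
```

Bisection contributes reliability, since the bracket always contains a zero, and Newton's Method contributes speed once the estimate is close enough for the quadratic error bound to take effect.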


Figure 6.4.3 $x_n \to \infty$
Figure 6.4.4 $x_n$ oscillates between $x_1$ and $x_2$


Exercises for Section 6.4

1. Let $f(x) := \cos ax$ for $x \in \mathbb{R}$ where $a \ne 0$. Find $f^{(n)}(x)$ for $n \in \mathbb{N}$, $x \in \mathbb{R}$.

2. Let $g(x) := |x^3|$ for $x \in \mathbb{R}$. Find $g'(x)$ and $g''(x)$ for $x \in \mathbb{R}$, and $g'''(x)$ for $x \ne 0$. Show that
$g'''(0)$ does not exist.

3. Use Induction to prove Leibniz’s rule for the $n$th derivative of a product:
\[
(fg)^{(n)}(x) = \sum_{k=0}^{n} \binom{n}{k} f^{(n-k)}(x)\, g^{(k)}(x).
\]

4. Show that if $x > 0$, then $1 + \tfrac{1}{2}x - \tfrac{1}{8}x^2 \le \sqrt{1 + x} \le 1 + \tfrac{1}{2}x$.

5. Use the preceding exercise to approximate $\sqrt{1.2}$ and $\sqrt{2}$. What is the best accuracy you can be
sure of, using this inequality?

6. Use Taylor’s Theorem with $n = 2$ to obtain more accurate approximations for $\sqrt{1.2}$ and $\sqrt{2}$.

7. If $x > 0$ show that $\left|(1 + x)^{1/3} - \left(1 + \tfrac{1}{3}x - \tfrac{1}{9}x^2\right)\right| \le (5/81)x^3$. Use this inequality to
approximate $\sqrt[3]{1.2}$ and $\sqrt[3]{2}$.

8. If $f(x) := e^x$, show that the remainder term in Taylor’s Theorem converges to zero as $n \to \infty$,
for each fixed $x_0$ and $x$. [Hint: See Theorem 3.2.11.]

9. If $g(x) := \sin x$, show that the remainder term in Taylor’s Theorem converges to zero as $n \to \infty$
for each fixed $x_0$ and $x$.

10. Let $h(x) := e^{-1/x^2}$ for $x \ne 0$ and $h(0) := 0$. Show that $h^{(n)}(0) = 0$ for all $n \in \mathbb{N}$. Conclude that
the remainder term in Taylor’s Theorem for $x_0 = 0$ does not converge to zero as $n \to \infty$ for
$x \ne 0$. [Hint: By L’Hospital’s Rule, $\lim_{x \to 0} h(x)/x^k = 0$ for any $k \in \mathbb{N}$. Use Exercise 3 to calculate
$h^{(n)}(x)$ for $x \ne 0$.]

11. If $x \in [0, 1]$ and $n \in \mathbb{N}$, show that
\[
\left|\ln(1 + x) - \left(x - \frac{x^2}{2} + \frac{x^3}{3} - \cdots + (-1)^{n-1}\frac{x^n}{n}\right)\right| < \frac{x^{n+1}}{n + 1}.
\]
Use this to approximate $\ln 1.5$ with an error less than 0.01. Less than 0.001.

12. We wish to approximate sine by a polynomial on $[-1, 1]$ so that the error is less than 0.001.
Show that we have
\[
\left|\sin x - \left(x - \frac{x^3}{6} + \frac{x^5}{120}\right)\right| < \frac{1}{5040} \quad \text{for } |x| \le 1.
\]

13. Calculate $e$ correct to seven decimal places.

14. Determine whether or not $x = 0$ is a point of relative extremum of the following functions:
(a) $f(x) := x^3 + 2$, (b) $g(x) := \sin x - x$,
(c) $h(x) := \sin x + \tfrac{1}{6}x^3$, (d) $k(x) := \cos x - 1 + \tfrac{1}{2}x^2$.

15. Let $f$ be continuous on $[a, b]$ and assume the second derivative $f''$ exists on $(a, b)$. Suppose that
the graph of $f$ and the line segment joining the points $(a, f(a))$ and $(b, f(b))$ intersect at a point
$(x_0, f(x_0))$ where $a < x_0 < b$. Show that there exists a point $c \in (a, b)$ such that $f''(c) = 0$.

16. Let $I \subseteq \mathbb{R}$ be an open interval, let $f : I \to \mathbb{R}$ be differentiable on $I$, and suppose $f''(a)$ exists at
$a \in I$. Show that
\[
f''(a) = \lim_{h \to 0} \frac{f(a + h) - 2f(a) + f(a - h)}{h^2}.
\]
Give an example where this limit exists, but the function does not have a second derivative at $a$.

17. Suppose that $I \subseteq \mathbb{R}$ is an open interval and that $f''(x) \ge 0$ for all $x \in I$. If $c \in I$, show that the
part of the graph of $f$ on $I$ is never below the tangent line to the graph at $(c, f(c))$.

18. Let $I \subseteq \mathbb{R}$ be an interval and let $c \in I$. Suppose that $f$ and $g$ are defined on $I$ and that the
derivatives $f^{(n)}$, $g^{(n)}$ exist and are continuous on $I$. If $f^{(k)}(c) = 0$ and $g^{(k)}(c) = 0$ for
$k = 0, 1, \ldots, n - 1$, but $g^{(n)}(c) \ne 0$, show that
\[
\lim_{x \to c} \frac{f(x)}{g(x)} = \frac{f^{(n)}(c)}{g^{(n)}(c)}.
\]

19. Show that the function $f(x) := x^3 - 2x - 5$ has a zero $r$ in the interval $I := [2, 2.2]$. If $x_1 := 2$
and if we define the sequence $(x_n)$ using the Newton procedure, show that
$|x_{n+1} - r| \le (0.7)|x_n - r|^2$. Show that $x_4$ is accurate to within six decimal places.

20. Approximate the real zeros of $g(x) := x^4 - x - 3$.

21. Approximate the real zeros of $h(x) := x^3 - x - 1$. Apply Newton’s Method starting with the
initial choices (a) $x_1 := 2$, (b) $x_1 := 0$, (c) $x_1 := -2$. Explain what happens.

22. The equation $\ln x = x - 2$ has two solutions. Approximate them using Newton’s Method. What
happens if $x_1 := \tfrac{1}{2}$ is the initial point?

23. The function $f(x) = 8x^3 - 8x^2 + 1$ has two zeros in $[0, 1]$. Approximate them, using Newton’s
Method, with the starting points (a) $x_1 := \tfrac{1}{8}$, (b) $x_1 := \tfrac{1}{4}$. Explain what happens.

24. Approximate the solution of the equation $x = \cos x$, accurate to within six decimals.


CHAPTER 7 


THE RIEMANN INTEGRAL 


We have already mentioned the developments, during the 1630s, by Fermat and Descartes 
leading to analytic geometry and the theory of the derivative. However, the subject we 
know as calculus did not begin to take shape until the late 1660s when Isaac Newton 
created his theory of “fluxions” and invented the method of “inverse tangents” to find
areas under curves. The reversal of the process for finding tangent lines to find areas was 
also discovered in the 1680s by Gottfried Leibniz, who was unaware of Newton’s 
unpublished work and who arrived at the discovery by a very different route. Leibniz 
introduced the terminology “calculus differentialis” and “calculus integralis,” since
finding tangent lines involved differences and finding areas involved summations. 
Thus, they had discovered that integration, being a process of summation, was inverse 
to the operation of differentiation. 

During a century and a half of development and refinement of techniques, calculus 
consisted of these paired operations and their applications, primarily to physical problems. 
In the 1850s, Bernhard Riemann adopted a new and different viewpoint. He separated the 
concept of integration from its companion, differentiation, and examined the motivating 
summation and limit process of finding areas by itself. He broadened the scope by 
considering all functions on an interval for which this process of “integration” could 
be defined: the class of “integrable” functions. The Fundamental Theorem of Calculus 
became a result that held only for a restricted set of integrable functions. The viewpoint of 
Riemann led others to invent other integration theories, the most significant being 
Lebesgue’s theory of integration. But there have been some advances made in more 
recent times that extend even the Lebesgue theory to a considerable extent. We will give a 
brief introduction to these results in Chapter 10. 


Bernhard Riemann 
(Georg Friedrich) Bernhard Riemann (1826-1866), the son of a poor
Lutheran minister, was born near Hanover, Germany. To please his
father, he enrolled (1846) at the University of Göttingen as a student of
theology and philosophy, but soon switched to mathematics. He interrupted
his studies at Göttingen to study at Berlin under C. G. J. Jacobi,
P. G. J. Dirichlet, and F. G. Eisenstein, but returned to Göttingen in 1849
to complete his thesis under Gauss. His thesis dealt with what are now
called “Riemann surfaces.” Gauss was so enthusiastic about Riemann’s
work that he arranged for him to become a privatdozent at Göttingen in
1854. On admission as a privatdozent, Riemann was required to prove himself by delivering a 
probationary lecture before the entire faculty. As tradition dictated, he submitted three topics, the 
first two of which he was well prepared to discuss. To Riemann’s surprise, Gauss chose that he 
should lecture on the third topic: “On the hypotheses that underlie the foundations of geometry.” 
After its publication, this lecture had a profound effect on modern geometry. 

Despite the fact that Riemann contracted tuberculosis and died at the age of 39, he made 
major contributions in many areas: the foundations of geometry, number theory, real and complex 
analysis, topology, and mathematical physics.

We begin by defining the concept of Riemann integrability of real-valued functions
defined on a closed bounded interval of R, using the Riemann sums familiar to the reader 
from calculus. This method has the advantage that it extends immediately to the case of 
functions whose values are complex numbers, or vectors in the space R”. In Section 7.2, we 
will establish the Riemann integrability of several important classes of functions: step 
functions, continuous functions, and monotone functions. However, we will also see that 
there are functions that are not Riemann integrable. The Fundamental Theorem of Calculus 
is the principal result in Section 7.3. We will present it in a form that is slightly more 
general than is customary and does not require the function to be a derivative at every point 
of the interval. A number of important consequences of the Fundamental Theorem are also 
given. In Section 7.3 we also give a statement of the definitive Lebesgue Criterion for 
Riemann integrability. This famous result is usually not given in books at this level, since 
its proof (given in Appendix C) is somewhat complicated. However, its statement is well 
within the reach of students, who will also comprehend the power of this result. In Section 
7.4, we discuss an alternative approach to the Riemann integral due to Gaston Darboux that 
uses the concepts of upper integral and lower integral. The two approaches appear to be 
quite different, but in fact they are shown to be equivalent. The final section presents 
several methods of approximating integrals, a subject that has become increasingly 
important during this era of high-speed computers. While the proofs of these results 
are not particularly difficult, we defer them to Appendix D. 

An interesting history of integration theory, including a chapter on the Riemann 
integral, is given in the book by Hawkins cited in the References. 


Section 7.1 Riemann Integral 


We will follow the procedure commonly used in calculus courses and define the Riemann 
integral as a kind of limit of the Riemann sums as the norm of the partitions tends to 0. Since we
assume that the reader is familiar—at least informally—with the integral from a calculus 
course, we will not provide a motivation of the integral, or discuss its interpretation as the 
“area under the graph,” or its many applications to physics, engineering, economics, etc. 
Instead, we will focus on the purely mathematical aspects of the integral. 

However, we first define some basic terms that will be frequently used. 


Partitions and Tagged Partitions 


If $I := [a, b]$ is a closed bounded interval in $\mathbb{R}$, then a partition of $I$ is a finite, ordered set
$P := (x_0, x_1, \ldots, x_{n-1}, x_n)$ of points in $I$ such that


\[ a = x_0 < x_1 < \cdots < x_{n-1} < x_n = b. \]


(See Figure 7.1.1.) The points of $P$ are used to divide $I = [a, b]$ into non-overlapping
subintervals


\[ I_1 := [x_0, x_1], \quad I_2 := [x_1, x_2], \quad \ldots, \quad I_n := [x_{n-1}, x_n]. \]


Figure 7.1.1 A partition of [a, b]
Often we will denote the partition $P$ by the notation $P = \{[x_{i-1}, x_i]\}_{i=1}^{n}$. We define the
norm (or mesh) of $P$ to be the number


\[ \|P\| := \max\{x_1 - x_0,\; x_2 - x_1,\; \ldots,\; x_n - x_{n-1}\}. \tag{1} \]


Thus the norm of a partition is merely the length of the largest subinterval into which the 
partition divides [a, b]. Clearly, many partitions have the same norm, so the partition is not 
a function of the norm. 
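
The definition of the norm can be checked on small cases with a short sketch such as the following. The function name `norm` and the sample partitions of [0, 3] are illustrative choices, not part of the text; the sketch only assumes that a partition is given as an increasing sequence of points.

```python
def norm(points):
    """Norm (mesh) of the partition with points x_0 < x_1 < ... < x_n."""
    widths = [right - left for left, right in zip(points, points[1:])]
    if not widths or min(widths) <= 0:
        raise ValueError("need at least two strictly increasing points")
    return max(widths)

# Two different partitions of [0, 3] with the same norm.
print(norm((0, 1, 3)))               # 2
print(norm((0, 2, 3)))               # 2
print(norm((0, 0.5, 1.5, 2.25, 3)))  # 1.0
```

The first two calls illustrate the remark above: distinct partitions can share a norm, so the norm does not determine the partition.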

If a point $t_i$ has been selected from each subinterval $I_i = [x_{i-1}, x_i]$, for $i = 1, 2, \ldots, n$,
then the points are called tags of the subintervals $I_i$. A set of ordered pairs


\[ \dot{P} := \{([x_{i-1}, x_i], t_i)\}_{i=1}^{n} \]


of subintervals and corresponding tags is called a tagged partition of $I$; see Figure 7.1.2.
(The dot over the P indicates that a tag has been chosen for each subinterval.) The tags can 
be chosen in a wholly arbitrary fashion; for example, we can choose the tags to be the left 
endpoints, or the right endpoints, or the midpoints of the subintervals, etc. Note that an 
endpoint of a subinterval can be used as a tag for two consecutive subintervals. Since each 
tag can be chosen in infinitely many ways, each partition can be tagged in infinitely many 
ways. The norm of a tagged partition is defined as for an ordinary partition and does not 
depend on the choice of tags. 




Figure 7.1.2 A tagged partition of [a, b] 


If $\dot{P}$ is the tagged partition given above, we define the Riemann sum of a function
$f : [a, b] \to \mathbb{R}$ corresponding to $\dot{P}$ to be the number


\[ S(f; \dot{P}) := \sum_{i=1}^{n} f(t_i)(x_i - x_{i-1}). \tag{2} \]


We will also use this notation when P denotes a subset of a partition, and not the entire 
partition. 

The reader will perceive that if the function $f$ is positive on $[a, b]$, then the Riemann
sum (2) is the sum of the areas of $n$ rectangles whose bases are the subintervals $I_i =
[x_{i-1}, x_i]$ and whose heights are $f(t_i)$. (See Figure 7.1.3.)


Figure 7.1.3 A Riemann sum 
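
As a concrete illustration of formula (2), the sketch below evaluates Riemann sums of $f(x) = x^2$ for one partition of [0, 3] under three tag choices. The helper names (`riemann_sum`, `left_tags`, `right_tags`, `midpoint_tags`) and the sample partition are assumptions made for this example only.

```python
def riemann_sum(f, points, tags):
    """S(f; P-dot): sum of f(t_i) * (x_i - x_{i-1}) over the subintervals."""
    if len(tags) != len(points) - 1:
        raise ValueError("need exactly one tag per subinterval")
    total = 0
    for left, right, tag in zip(points, points[1:], tags):
        if not left <= tag <= right:
            raise ValueError("each tag must lie in its subinterval")
        total += f(tag) * (right - left)
    return total

def left_tags(points):
    return list(points[:-1])

def right_tags(points):
    return list(points[1:])

def midpoint_tags(points):
    return [(left + right) / 2 for left, right in zip(points, points[1:])]

def square(x):
    return x * x

points = [0, 1, 2, 3]
print(riemann_sum(square, points, left_tags(points)))      # 0 + 1 + 4 = 5
print(riemann_sum(square, points, right_tags(points)))     # 1 + 4 + 9 = 14
print(riemann_sum(square, points, midpoint_tags(points)))  # 0.25 + 2.25 + 6.25 = 8.75
```

The same partition gives three different sums, which is why the definition that follows must control all tag choices at once.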
Definition of the Riemann Integral

We now define the Riemann integral of a function $f$ on an interval $[a, b]$.

7.1.1 Definition A function $f : [a, b] \to \mathbb{R}$ is said to be Riemann integrable on $[a, b]$ if
there exists a number $L \in \mathbb{R}$ such that for every $\varepsilon > 0$ there exists $\delta_\varepsilon > 0$ such that if $\dot{P}$ is
any tagged partition of $[a, b]$ with $\|\dot{P}\| < \delta_\varepsilon$, then


\[ |S(f; \dot{P}) - L| < \varepsilon. \]

The set of all Riemann integrable functions on $[a, b]$ will be denoted by $\mathcal{R}[a, b]$.


Remark It is sometimes said that the integral $L$ is “the limit” of the Riemann sums
$S(f; \dot{P})$ as the norm $\|\dot{P}\| \to 0$. However, since $S(f; \dot{P})$ is not a function of $\|\dot{P}\|$, this limit
is not of the type that we have studied before.


First we will show that if $f \in \mathcal{R}[a, b]$, then the number $L$ is uniquely determined. It will
be called the Riemann integral of $f$ over $[a, b]$. Instead of $L$, we will usually write


\[ L = \int_a^b f \qquad \text{or} \qquad \int_a^b f(x)\,dx. \]


It should be understood that any letter other than x can be used in the latter expression, so 
long as it does not cause any ambiguity. 


7.1.2 Theorem If $f \in \mathcal{R}[a, b]$, then the value of the integral is uniquely determined.


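Proof. Assume that $L'$ and $L''$ both satisfy the definition and let $\varepsilon > 0$. Then there exists
$\delta'_{\varepsilon/2} > 0$ such that if $\dot{P}_1$ is any tagged partition with $\|\dot{P}_1\| < \delta'_{\varepsilon/2}$, then


\[ |S(f; \dot{P}_1) - L'| < \varepsilon/2. \]

Also there exists $\delta''_{\varepsilon/2} > 0$ such that if $\dot{P}_2$ is any tagged partition with $\|\dot{P}_2\| < \delta''_{\varepsilon/2}$, then

\[ |S(f; \dot{P}_2) - L''| < \varepsilon/2. \]


Now let $\delta_\varepsilon := \min\{\delta'_{\varepsilon/2}, \delta''_{\varepsilon/2}\} > 0$ and let $\dot{P}$ be a tagged partition with $\|\dot{P}\| < \delta_\varepsilon$. Since
both $\|\dot{P}\| < \delta'_{\varepsilon/2}$ and $\|\dot{P}\| < \delta''_{\varepsilon/2}$, then


\[ |S(f; \dot{P}) - L'| < \varepsilon/2 \qquad \text{and} \qquad |S(f; \dot{P}) - L''| < \varepsilon/2, \]


whence it follows from the Triangle Inequality that


\[
\begin{aligned}
|L' - L''| &= |L' - S(f; \dot{P}) + S(f; \dot{P}) - L''| \\
&\le |L' - S(f; \dot{P})| + |S(f; \dot{P}) - L''| \\
&< \varepsilon/2 + \varepsilon/2 = \varepsilon.
\end{aligned}
\]

Since $\varepsilon > 0$ is arbitrary, it follows that $L' = L''$. Q.E.D.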


The next result shows that changing a function at a finite number of points does not 
affect its integrability nor the value of its integral. 


7.1.3 Theorem If $g$ is Riemann integrable on $[a, b]$ and if $f(x) = g(x)$ except for a finite
number of points in $[a, b]$, then $f$ is Riemann integrable and $\int_a^b f = \int_a^b g$.
Proof. If we prove the assertion for the case of one exceptional point, then the extension
to a finite number of points is done by a standard induction argument, which we leave to the
reader.

Let $c$ be a point in the interval and let $L = \int_a^b g$. Assume that $f(x) = g(x)$ for all $x \ne c$
(and that $f(c) \ne g(c)$, since otherwise there is nothing to prove).
For any tagged partition $\dot{P}$, the terms in the two sums $S(f; \dot{P})$ and $S(g; \dot{P})$ are identical
with the exception of at most two terms (two terms occur when $c$ is a common endpoint of
adjacent subintervals and is the tag of both). Therefore, we have


\[ |S(f; \dot{P}) - S(g; \dot{P})| = \Bigl| \sum_{i=1}^{n} \bigl(f(t_i) - g(t_i)\bigr)(x_i - x_{i-1}) \Bigr| \le 2\bigl(|g(c)| + |f(c)|\bigr)\|\dot{P}\|. \]


Now, given $\varepsilon > 0$, we let $\delta_1 > 0$ satisfy $\delta_1 < \varepsilon/\bigl(4(|f(c)| + |g(c)|)\bigr)$, and let $\delta_2 > 0$ be such
that $\|\dot{P}\| < \delta_2$ implies $|S(g; \dot{P}) - L| < \varepsilon/2$. We now let $\delta := \min\{\delta_1, \delta_2\}$. Then, if
$\|\dot{P}\| < \delta$, we obtain


\[ |S(f; \dot{P}) - L| \le |S(f; \dot{P}) - S(g; \dot{P})| + |S(g; \dot{P}) - L| < \varepsilon/2 + \varepsilon/2 = \varepsilon. \]


Hence, the function $f$ is integrable with integral $L$. Q.E.D.


Some Examples 


If we use only the definition, in order to show that a function $f$ is Riemann integrable we
must (i) know (or guess correctly) the value $L$ of the integral, and (ii) construct a $\delta_\varepsilon$ that will
suffice for an arbitrary $\varepsilon > 0$. The determination of $L$ is sometimes done by calculating
Riemann sums and guessing what $L$ must be. The determination of $\delta_\varepsilon$ is likely to be
difficult.

In actual practice, we usually show that $f \in \mathcal{R}[a, b]$ by making use of some of the
theorems that will be given later.


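7.1.4 Examples (a) Every constant function on $[a, b]$ is in $\mathcal{R}[a, b]$.
Let $f(x) := k$ for all $x \in [a, b]$. If $\dot{P} := \{([x_{i-1}, x_i], t_i)\}_{i=1}^{n}$ is any tagged partition of
$[a, b]$, then it is clear that


\[ S(f; \dot{P}) = \sum_{i=1}^{n} k(x_i - x_{i-1}) = k(b - a). \]


Hence, for any $\varepsilon > 0$, we can choose $\delta_\varepsilon := 1$ so that if $\|\dot{P}\| < \delta_\varepsilon$, then

\[ |S(f; \dot{P}) - k(b - a)| = 0 < \varepsilon. \]


Since $\varepsilon > 0$ is arbitrary, we conclude that $f \in \mathcal{R}[a, b]$ and $\int_a^b f = k(b - a)$.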

(b) Let $g : [0, 3] \to \mathbb{R}$ be defined by $g(x) := 2$ for $0 \le x \le 1$, and $g(x) := 3$ for $1 <
x \le 3$. A preliminary investigation, based on the graph of $g$ (see Figure 7.1.4), suggests
that we might expect that $\int_0^3 g = 8$.


Figure 7.1.4 Graph of g 
Let $\dot{P}$ be a tagged partition of $[0, 3]$ with norm $< \delta$; we will show how to determine $\delta$ in
order to ensure that $|S(g; \dot{P}) - 8| < \varepsilon$. Let $\dot{P}_1$ be the subset of $\dot{P}$ having its tags in $[0, 1]$
where $g(x) = 2$, and let $\dot{P}_2$ be the subset of $\dot{P}$ with its tags in $(1, 3]$ where $g(x) = 3$. It is
obvious that we have


\[ S(g; \dot{P}) = S(g; \dot{P}_1) + S(g; \dot{P}_2). \tag{3} \]

If we let $U_1$ denote the union of the subintervals in $\dot{P}_1$, then it is readily shown that

\[ [0, 1 - \delta] \subset U_1 \subset [0, 1 + \delta]. \tag{4} \]


For example, to prove the first inclusion, we let $u \in [0, 1 - \delta]$. Then $u$ lies in an interval
$I_k := [x_{k-1}, x_k]$ of $\dot{P}$, and since $\|\dot{P}\| < \delta$, we have $x_k - x_{k-1} < \delta$. Then $x_{k-1} \le u \le
1 - \delta$ implies that $x_k \le x_{k-1} + \delta \le (1 - \delta) + \delta \le 1$. Thus the tag $t_k$ in $I_k$ satisfies $t_k \le 1$
and therefore $u$ belongs to a subinterval whose tag is in $[0, 1]$, that is, $u \in U_1$. This proves the
first inclusion in (4), and the second inclusion can be shown in the same manner. Since
$g(t_k) = 2$ for the tags of $\dot{P}_1$ and since the intervals in (4) have lengths $1 - \delta$ and $1 + \delta$,
respectively, it follows that


\[ 2(1 - \delta) \le S(g; \dot{P}_1) \le 2(1 + \delta). \]


A similar argument shows that the union of all subintervals with tags $t_i \in (1, 3]$ contains
the interval $[1 + \delta, 3]$ of length $2 - \delta$, and is contained in $[1 - \delta, 3]$ of length $2 + \delta$.
Therefore,


\[ 3(2 - \delta) \le S(g; \dot{P}_2) \le 3(2 + \delta). \]

Adding these inequalities and using equation (3), we have

\[ 8 - 5\delta \le S(g; \dot{P}) = S(g; \dot{P}_1) + S(g; \dot{P}_2) \le 8 + 5\delta, \]

whence it follows that

\[ |S(g; \dot{P}) - 8| \le 5\delta. \]

To have this final term $< \varepsilon$, we are led to take $\delta_\varepsilon < \varepsilon/5$.

Making such a choice (for example, if we take $\delta_\varepsilon := \varepsilon/10$), we can retrace the
argument and see that $|S(g; \dot{P}) - 8| < \varepsilon$ when $\|\dot{P}\| < \delta_\varepsilon$. Since $\varepsilon > 0$ is arbitrary, we have
proved that $g \in \mathcal{R}[0, 3]$ and that $\int_0^3 g = 8$, as predicted.
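
The estimate $|S(g; \dot{P}) - 8| \le 5\|\dot{P}\|$ obtained in (b) can be spot-checked numerically. The sketch below is only a sanity check under stated assumptions, not a proof: it draws random tagged partitions of [0, 3] with rational points and tags (exact arithmetic via `fractions.Fraction`) and tests the inequality for each. The helper `random_tagged_partition` and the trial counts are hypothetical choices.

```python
import random
from fractions import Fraction

def g(x):
    """The step function of Example 7.1.4(b): 2 on [0, 1], 3 on (1, 3]."""
    return 2 if x <= 1 else 3

def riemann_sum(f, points, tags):
    return sum(f(t) * (r - l) for l, r, t in zip(points, points[1:], tags))

def norm(points):
    return max(r - l for l, r in zip(points, points[1:]))

def random_tagged_partition(n, rng):
    """Hypothetical helper: n subintervals of [0, 3] with rational points and tags."""
    interior = set()
    while len(interior) < n - 1:
        interior.add(Fraction(3 * rng.randint(1, 999), 1000))
    points = [Fraction(0)] + sorted(interior) + [Fraction(3)]
    tags = [l + Fraction(rng.randint(0, 10), 10) * (r - l)
            for l, r in zip(points, points[1:])]
    return points, tags

def check_bound(trials=200, seed=0):
    """Return True if |S(g; P) - 8| <= 5 * ||P|| held in every trial."""
    rng = random.Random(seed)
    for _ in range(trials):
        points, tags = random_tagged_partition(rng.randint(2, 40), rng)
        if abs(riemann_sum(g, points, tags) - 8) > 5 * norm(points):
            return False
    return True

print(check_bound())  # True
```

Only subintervals that contain the point 1 can contribute an error, so the observed errors are typically much smaller than the bound $5\|\dot{P}\|$ used in the text.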

(c) Let $h(x) := x$ for $x \in [0, 1]$; we will show that $h \in \mathcal{R}[0, 1]$.

We will employ a “trick” that enables us to guess the value of the integral by
considering a particular choice of the tag points. Indeed, if $\{I_i\}_{i=1}^{n}$ is any partition of $[0, 1]$
and we choose the tag of the interval $I_i = [x_{i-1}, x_i]$ to be the midpoint $q_i := \tfrac{1}{2}(x_{i-1} + x_i)$,
then the contribution of this term to the Riemann sum corresponding to the tagged partition
$\dot{Q} := \{(I_i, q_i)\}_{i=1}^{n}$ is


\[ h(q_i)(x_i - x_{i-1}) = \tfrac{1}{2}(x_i + x_{i-1})(x_i - x_{i-1}) = \tfrac{1}{2}(x_i^2 - x_{i-1}^2). \]


If we add these terms and note that the sum telescopes, we obtain

\[ S(h; \dot{Q}) = \sum_{i=1}^{n} \tfrac{1}{2}(x_i^2 - x_{i-1}^2) = \tfrac{1}{2}(1^2 - 0^2) = \tfrac{1}{2}. \]


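Now let $\dot{P} := \{(I_i, t_i)\}_{i=1}^{n}$ be an arbitrary tagged partition of $[0, 1]$ with $\|\dot{P}\| < \delta$ so that
$x_i - x_{i-1} < \delta$ for $i = 1, \ldots, n$. Also let $\dot{Q}$ have the same partition points, but where we
choose the tag $q_i$ to be the midpoint of the interval $I_i$. Since both $t_i$ and $q_i$ belong to this
interval, we have $|t_i - q_i| < \delta$. Using the Triangle Inequality, we deduce


\[
\begin{aligned}
|S(h; \dot{P}) - S(h; \dot{Q})| &= \Bigl| \sum_{i=1}^{n} t_i(x_i - x_{i-1}) - \sum_{i=1}^{n} q_i(x_i - x_{i-1}) \Bigr| \\
&\le \sum_{i=1}^{n} |t_i - q_i|(x_i - x_{i-1}) < \delta \sum_{i=1}^{n} (x_i - x_{i-1}) = \delta(x_n - x_0) = \delta.
\end{aligned}
\]


Since $S(h; \dot{Q}) = \tfrac{1}{2}$, we infer that if $\dot{P}$ is any tagged partition with $\|\dot{P}\| < \delta$, then

\[ \bigl| S(h; \dot{P}) - \tfrac{1}{2} \bigr| < \delta. \]

Therefore we are led to take $\delta_\varepsilon \le \varepsilon$. If we choose $\delta_\varepsilon := \varepsilon$, we can retrace the argument to
conclude that $h \in \mathcal{R}[0, 1]$ and $\int_0^1 h = \int_0^1 x\,dx = \tfrac{1}{2}$.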
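
The midpoint “trick” in (c) is easy to confirm in exact arithmetic. The sketch below uses one arbitrary non-uniform partition of [0, 1] (the points are an illustrative choice) and checks that midpoint tags give exactly 1/2, while left- and right-endpoint tags stay within the norm of that value.

```python
from fractions import Fraction

def riemann_sum(f, points, tags):
    return sum(f(t) * (r - l) for l, r, t in zip(points, points[1:], tags))

def h(x):
    return x

# An arbitrary (non-uniform) partition of [0, 1]; the points are illustrative.
points = [Fraction(0), Fraction(1, 10), Fraction(1, 3), Fraction(3, 4), Fraction(1)]
mesh = max(r - l for l, r in zip(points, points[1:]))

midpoints = [(l + r) / 2 for l, r in zip(points, points[1:])]
s_mid = riemann_sum(h, points, midpoints)     # telescopes to exactly 1/2
s_left = riemann_sum(h, points, points[:-1])  # left-endpoint tags
s_right = riemann_sum(h, points, points[1:])  # right-endpoint tags

print(s_mid)                                                    # 1/2
print(abs(s_left - s_mid) < mesh, abs(s_right - s_mid) < mesh)  # True True
```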
(d) Let $G(x) := 1/n$ for $x = 1/n$ ($n \in \mathbb{N}$), and $G(x) := 0$ elsewhere on $[0, 1]$.

Given $\varepsilon > 0$, the set $E := \{x : G(x) \ge \varepsilon\}$ is a finite set. (For example, if $\varepsilon = 1/10$,
then $E = \{1, 1/2, 1/3, \ldots, 1/10\}$.) If $n$ is the number of points in $E$, we allow for the
possibility that a tag may be counted twice if it is an endpoint and let $\delta := \varepsilon/2n$. For a given
tagged partition $\dot{P}$ such that $\|\dot{P}\| < \delta$, we let $\dot{P}_0$ be the subset of $\dot{P}$ with all tags outside of $E$
and let $\dot{P}_1$ be the subset of $\dot{P}$ with one or more tags in $E$. Since $G(x) < \varepsilon$ for each $x$ outside
of $E$ and $G(x) \le 1$ for all $x$ in $[0, 1]$, we get


\[ 0 \le S(G; \dot{P}) = S(G; \dot{P}_0) + S(G; \dot{P}_1) < \varepsilon + (2n)\delta = 2\varepsilon. \]


Since $\varepsilon > 0$ is arbitrary, we conclude that $G$ is Riemann integrable with integral equal to
zero.


Some Properties of the Integral 


The difficulties involved in determining the value of the integral and of $\delta_\varepsilon$ suggest that it
would be very useful to have some general theorems. The first result in this direction 
enables us to form certain algebraic combinations of integrable functions. 


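7.1.5 Theorem Suppose that $f$ and $g$ are in $\mathcal{R}[a, b]$. Then:
(a) If $k \in \mathbb{R}$, the function $kf$ is in $\mathcal{R}[a, b]$ and


\[ \int_a^b kf = k \int_a^b f. \]


(b) The function $f + g$ is in $\mathcal{R}[a, b]$ and


\[ \int_a^b (f + g) = \int_a^b f + \int_a^b g. \]


(c) If $f(x) \le g(x)$ for all $x \in [a, b]$, then


\[ \int_a^b f \le \int_a^b g. \]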
Proof. If $\dot{P} = \{([x_{i-1}, x_i], t_i)\}_{i=1}^{n}$ is a tagged partition of $[a, b]$, then it is an easy exercise
to show that


\[ S(kf; \dot{P}) = kS(f; \dot{P}), \qquad S(f + g; \dot{P}) = S(f; \dot{P}) + S(g; \dot{P}), \]

and, under the hypothesis of (c), that

\[ S(f; \dot{P}) \le S(g; \dot{P}). \]


We leave it to the reader to show that the assertion (a) follows from the first equality. 
As an example, we will complete the proofs of (b) and (c). 

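Given $\varepsilon > 0$, we can use the argument in the proof of the Uniqueness Theorem 7.1.2 to
construct a number $\delta_\varepsilon > 0$ such that if $\dot{P}$ is any tagged partition with $\|\dot{P}\| < \delta_\varepsilon$, then both


\[ \Bigl| S(f; \dot{P}) - \int_a^b f \Bigr| < \varepsilon/2 \qquad \text{and} \qquad \Bigl| S(g; \dot{P}) - \int_a^b g \Bigr| < \varepsilon/2. \tag{5} \]


To prove (b), we note that


\[
\begin{aligned}
\Bigl| S(f + g; \dot{P}) - \Bigl( \int_a^b f + \int_a^b g \Bigr) \Bigr|
&= \Bigl| S(f; \dot{P}) + S(g; \dot{P}) - \int_a^b f - \int_a^b g \Bigr| \\
&\le \Bigl| S(f; \dot{P}) - \int_a^b f \Bigr| + \Bigl| S(g; \dot{P}) - \int_a^b g \Bigr| \\
&< \varepsilon/2 + \varepsilon/2 = \varepsilon.
\end{aligned}
\]


Since $\varepsilon > 0$ is arbitrary, we conclude that $f + g \in \mathcal{R}[a, b]$ and that its integral is the sum of
the integrals of $f$ and $g$.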
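To prove (c), we note that the Triangle Inequality applied to (5) implies


\[ \int_a^b f - \varepsilon/2 < S(f; \dot{P}) \qquad \text{and} \qquad S(g; \dot{P}) < \int_a^b g + \varepsilon/2. \]


If we use the fact that $S(f; \dot{P}) \le S(g; \dot{P})$, we have


\[ \int_a^b f \le \int_a^b g + \varepsilon. \]


But, since $\varepsilon > 0$ is arbitrary, we conclude that $\int_a^b f \le \int_a^b g$. Q.E.D.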
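
The three relations between Riemann sums used at the start of this proof can be verified on a small instance. The sketch below is an illustration only: the functions $f(x) = x^2$ and $g(x) = x + 1$ on [0, 1], the constant $k = -3$, and the tagged partition are all assumed sample data, chosen so that $f \le g$ on the interval.

```python
from fractions import Fraction

def riemann_sum(f, points, tags):
    return sum(f(t) * (r - l) for l, r, t in zip(points, points[1:], tags))

points = [Fraction(0), Fraction(1, 4), Fraction(1, 2), Fraction(1)]
tags = [Fraction(1, 8), Fraction(1, 2), Fraction(3, 4)]

def f(x):
    return x * x

def g(x):
    return x + 1          # f(x) <= g(x) on [0, 1]

k = Fraction(-3)
s_f = riemann_sum(f, points, tags)
s_g = riemann_sum(g, points, tags)

print(riemann_sum(lambda x: k * f(x), points, tags) == k * s_f)       # True
print(riemann_sum(lambda x: f(x) + g(x), points, tags) == s_f + s_g)  # True
print(s_f <= s_g)                                                     # True
```

Exact rational arithmetic is used so that the equalities hold literally rather than up to rounding.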


Boundedness Theorem 


We now show that an unbounded function cannot be Riemann integrable. 
7.1.6 Theorem If $f \in \mathcal{R}[a, b]$, then $f$ is bounded on $[a, b]$.


Proof. Assume that $f$ is an unbounded function in $\mathcal{R}[a, b]$ with integral $L$. Then there
exists $\delta > 0$ such that if $\dot{P}$ is any tagged partition of $[a, b]$ with $\|\dot{P}\| < \delta$, then we have
$|S(f; \dot{P}) - L| < 1$, which implies that


\[ |S(f; \dot{P})| < |L| + 1. \tag{6} \]


Now let $Q = \{[x_{i-1}, x_i]\}_{i=1}^{n}$ be a partition of $[a, b]$ with $\|Q\| < \delta$. Since $|f|$ is not bounded
on $[a, b]$, then there exists at least one subinterval in $Q$, say $[x_{k-1}, x_k]$, on which $|f|$ is not
bounded. Indeed, if $|f|$ is bounded on each subinterval $[x_{i-1}, x_i]$ by $M_i$, then it is bounded on
$[a, b]$ by $\max\{M_1, \ldots, M_n\}$.
We will now pick tags for $Q$ that will provide a contradiction to (6). We tag $Q$ by
$t_i := x_i$ for $i \ne k$ and we pick $t_k \in [x_{k-1}, x_k]$ such that


\[ |f(t_k)(x_k - x_{k-1})| > |L| + 1 + \Bigl| \sum_{i \ne k} f(t_i)(x_i - x_{i-1}) \Bigr|. \]

From the Triangle Inequality (in the form $|A + B| \ge |A| - |B|$), we have


\[ |S(f; \dot{Q})| \ge |f(t_k)(x_k - x_{k-1})| - \Bigl| \sum_{i \ne k} f(t_i)(x_i - x_{i-1}) \Bigr| > |L| + 1, \]


which contradicts (6). Q.E.D.


We will close this section with an example of a function that is discontinuous at every 
rational number and is not monotone, but is Riemann integrable nevertheless. 


7.1.7 Example We consider Thomae’s function $h : [0, 1] \to \mathbb{R}$ defined, as in Example
5.1.6(h), by $h(x) := 0$ if $x \in [0, 1]$ is irrational, $h(0) := 1$ and by $h(x) := 1/n$ if $x \in
[0, 1]$ is the rational number $x = m/n$ where $m, n \in \mathbb{N}$ have no common integer factors
except 1. It was seen in 5.1.6(h) that $h$ is continuous at every irrational number and
discontinuous at every rational number in $[0, 1]$. See Figure 5.1.2. We will now show that
$h \in \mathcal{R}[0, 1]$.

For $\varepsilon > 0$, the set $E := \{x \in [0, 1] : h(x) \ge \varepsilon/2\}$ is a finite set. (For example, if
$\varepsilon/2 = 1/5$, then there are eleven values of $x$ such that $h(x) \ge 1/5$, namely,
$E = \{0, 1, 1/2, 1/3, 2/3, 1/4, 3/4, 1/5, 2/5, 3/5, 4/5\}$. Sketch a graph.) We let $n$
be the number of elements in $E$ and take $\delta := \varepsilon/(4n)$. If $\dot{P}$ is a given tagged partition such
that $\|\dot{P}\| < \delta$, then we separate $\dot{P}$ into two subsets. We let $\dot{P}_1$ be the collection of tagged
intervals in $\dot{P}$ that have their tags in $E$, and we let $\dot{P}_2$ be the subset of tagged intervals in $\dot{P}$
that have their tags elsewhere in $[0, 1]$. Allowing for the possibility that a tag of $\dot{P}_1$ may be
an endpoint of adjacent intervals, we see that $\dot{P}_1$ has at most $2n$ intervals and the total
length of these intervals can be at most $2n\delta = \varepsilon/2$. Also, we have $0 < h(t_i) \le 1$ for each tag
$t_i$ in $\dot{P}_1$. Consequently, we have $S(h; \dot{P}_1) \le 1 \cdot 2n\delta \le \varepsilon/2$. For tags $t_i$ in $\dot{P}_2$, we have
$h(t_i) < \varepsilon/2$ and the total length of the subintervals in $\dot{P}_2$ is clearly at most 1, so that
$S(h; \dot{P}_2) < (\varepsilon/2) \cdot 1 = \varepsilon/2$. Therefore, combining these results, we get


\[ 0 \le S(h; \dot{P}) = S(h; \dot{P}_1) + S(h; \dot{P}_2) < \varepsilon/2 + \varepsilon/2 = \varepsilon. \]


Since $\varepsilon > 0$ is arbitrary, we infer that $h \in \mathcal{R}[0, 1]$ with integral 0.
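
Two claims in this example lend themselves to a quick computation: the count of eleven points in $E$ when $\varepsilon/2 = 1/5$, and the smallness of Riemann sums for fine partitions. The sketch below is a hedged illustration, assuming tags are rational (represented by `fractions.Fraction`), which is the unfavorable case since $h$ vanishes at irrational tags. The function names and the choice of uniform partitions with right-endpoint tags are assumptions of this example.

```python
from fractions import Fraction

def thomae(x):
    """Thomae's function at a rational point x (a Fraction, stored in lowest terms)."""
    return Fraction(1, x.denominator)

def finite_set_E(bound):
    """All rationals x in [0, 1] with thomae(x) >= bound, for 0 < bound <= 1."""
    max_den = int(1 / bound)  # denominators larger than 1/bound give values < bound
    return {Fraction(m, n) for n in range(1, max_den + 1) for m in range(n + 1)
            if thomae(Fraction(m, n)) >= bound}

def uniform_right_sum(n):
    """Riemann sum of Thomae's function: n equal subintervals, right-endpoint tags."""
    return sum(thomae(Fraction(k, n)) * Fraction(1, n) for k in range(1, n + 1))

print(len(finite_set_E(Fraction(1, 5))))  # 11
print(uniform_right_sum(10))              # 27/100
print(uniform_right_sum(101))             # 201/10201
```

Note that `Fraction(0, n)` is stored as 0/1, so `thomae` returns 1 at 0, in agreement with the convention $h(0) := 1$.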


Exercises for Section 7.1 


1. If $I := [0, 4]$, calculate the norms of the following partitions:
(a) $P_1 := (0, 1, 2, 4)$, (b) $P_2 := (0, 2, 3, 4)$,
(c) $P_3 := (0, 1, 1.5, 2, 3.4, 4)$, (d) $P_4 := (0, .5, 2.5, 3.5, 4)$.


2. If $f(x) := x^2$ for $x \in [0, 4]$, calculate the following Riemann sums, where $\dot{P}_i$ has the same
partition points as in Exercise 1, and the tags are selected as indicated.
(a) $\dot{P}_1$ with the tags at the left endpoints of the subintervals.
(b) $\dot{P}_1$ with the tags at the right endpoints of the subintervals.
(c) $\dot{P}_2$ with the tags at the left endpoints of the subintervals.
(d) $\dot{P}_2$ with the tags at the right endpoints of the subintervals.


3. Show that $f : [a, b] \to \mathbb{R}$ is Riemann integrable on $[a, b]$ if and only if there exists $L \in \mathbb{R}$ such
that for every $\varepsilon > 0$ there exists $\delta_\varepsilon > 0$ such that if $\dot{P}$ is any tagged partition with norm
$\|\dot{P}\| \le \delta_\varepsilon$, then $|S(f; \dot{P}) - L| \le \varepsilon$.


4. Let $\dot{P}$ be a tagged partition of $[0, 3]$.
(a) Show that the union $U_1$ of all subintervals in $\dot{P}$ with tags in $[0, 1]$ satisfies
$[0, 1 - \|\dot{P}\|] \subseteq U_1 \subseteq [0, 1 + \|\dot{P}\|]$.
(b) Show that the union $U_2$ of all subintervals in $\dot{P}$ with tags in $[1, 2]$ satisfies
$[1 + \|\dot{P}\|, 2 - \|\dot{P}\|] \subseteq U_2 \subseteq [1 - \|\dot{P}\|, 2 + \|\dot{P}\|]$.


5. Let $\dot{P} := \{(I_i, t_i)\}_{i=1}^{n}$ be a tagged partition of $[a, b]$ and let $c_1 < c_2$.
(a) If $u$ belongs to a subinterval $I_i$ whose tag satisfies $c_1 \le t_i \le c_2$, show that
$c_1 - \|\dot{P}\| \le u \le c_2 + \|\dot{P}\|$.
(b) If $v \in [a, b]$ and satisfies $c_1 + \|\dot{P}\| \le v \le c_2 - \|\dot{P}\|$, then the tag $t_i$ of any subinterval $I_i$
that contains $v$ satisfies $t_i \in [c_1, c_2]$.


6. (a) Let $f(x) := 2$ if $0 \le x < 1$ and $f(x) := 1$ if $1 \le x \le 2$. Show that $f \in \mathcal{R}[0, 2]$ and evaluate
its integral.
(b) Let $h(x) := 2$ if $0 \le x < 1$, $h(1) := 3$ and $h(x) := 1$ if $1 < x \le 2$. Show that $h \in \mathcal{R}[0, 2]$
and evaluate its integral.


7. Use Mathematical Induction and Theorem 7.1.5 to show that if $f_1, \ldots, f_n$ are in $\mathcal{R}[a, b]$
and if $k_1, \ldots, k_n \in \mathbb{R}$, then the linear combination $f = \sum_{i=1}^{n} k_i f_i$ belongs to $\mathcal{R}[a, b]$ and

\[ \int_a^b f = \sum_{i=1}^{n} k_i \int_a^b f_i. \]


8. If $f \in \mathcal{R}[a, b]$ and $|f(x)| \le M$ for all $x \in [a, b]$, show that $\bigl| \int_a^b f \bigr| \le M(b - a)$.


9. If $f \in \mathcal{R}[a, b]$ and if $(\dot{P}_n)$ is any sequence of tagged partitions of $[a, b]$ such that $\|\dot{P}_n\| \to 0$,
prove that $\int_a^b f = \lim_n S(f; \dot{P}_n)$.


10. Let $g(x) := 0$ if $x \in [0, 1]$ is rational and $g(x) := 1/x$ if $x \in [0, 1]$ is irrational. Explain why
$g \notin \mathcal{R}[0, 1]$. However, show that there exists a sequence $(\dot{P}_n)$ of tagged partitions of $[0, 1]$ such
that $\|\dot{P}_n\| \to 0$ and $\lim_n S(g; \dot{P}_n)$ exists.


11. Suppose that $f$ is bounded on $[a, b]$ and that there exist two sequences of tagged partitions of
$[a, b]$ such that $\|\dot{P}_n\| \to 0$ and $\|\dot{Q}_n\| \to 0$, but such that $\lim_n S(f; \dot{P}_n) \ne \lim_n S(f; \dot{Q}_n)$. Show
that $f$ is not in $\mathcal{R}[a, b]$.


12. Consider the Dirichlet function, introduced in Example 5.1.6(g), defined by $f(x) := 1$ for $x \in
[0, 1]$ rational and $f(x) := 0$ for $x \in [0, 1]$ irrational. Use the preceding exercise to show that $f$ is
not Riemann integrable on $[0, 1]$.


13. Suppose that $c < d$ are points in $[a, b]$. If $\varphi : [a, b] \to \mathbb{R}$ satisfies $\varphi(x) = \alpha > 0$ for $x \in [c, d]$
and $\varphi(x) = 0$ elsewhere in $[a, b]$, prove that $\varphi \in \mathcal{R}[a, b]$ and that $\int_a^b \varphi = \alpha(d - c)$. [Hint: Given
$\varepsilon > 0$ let $\delta_\varepsilon := \varepsilon/4\alpha$ and show that if $\|\dot{P}\| < \delta_\varepsilon$ then we have $\alpha(d - c - 2\delta_\varepsilon) \le S(\varphi; \dot{P}) \le
\alpha(d - c + 2\delta_\varepsilon)$.]


14. Let $0 \le a < b$, let $Q(x) := x^2$ for $x \in [a, b]$ and let $P := \{[x_{i-1}, x_i]\}_{i=1}^{n}$ be a partition of $[a, b]$.
For each $i$, let $q_i$ be the positive square root of

\[ \tfrac{1}{3}\bigl(x_i^2 + x_i x_{i-1} + x_{i-1}^2\bigr). \]

(a) Show that $q_i$ satisfies $0 \le x_{i-1} \le q_i \le x_i$.
(b) Show that $Q(q_i)(x_i - x_{i-1}) = \tfrac{1}{3}(x_i^3 - x_{i-1}^3)$.
(c) If $\dot{Q}$ is the tagged partition with the same subintervals as $P$ and the tags $q_i$, show that
$S(Q; \dot{Q}) = \tfrac{1}{3}(b^3 - a^3)$.
(d) Use the argument in Example 7.1.4(c) to show that $Q \in \mathcal{R}[a, b]$ and

\[ \int_a^b Q = \int_a^b x^2\,dx = \tfrac{1}{3}(b^3 - a^3). \]


15. If $f \in \mathcal{R}[a, b]$ and $c \in \mathbb{R}$, we define $g$ on $[a + c, b + c]$ by $g(y) := f(y - c)$. Prove that
$g \in \mathcal{R}[a + c, b + c]$ and that $\int_{a+c}^{b+c} g = \int_a^b f$. The function $g$ is called the $c$-translate of $f$.


Section 7.2 Riemann Integrable Functions


We begin with a proof of the important Cauchy Criterion. We will then prove the Squeeze 
Theorem, which will be used to establish the Riemann integrability of several classes of 
functions (step functions, continuous functions, and monotone functions). Finally we will 
establish the Additivity Theorem. 

We have already noted that direct use of the definition requires that we know the value 
of the integral. The Cauchy Criterion removes this need, but at the cost of considering two 
Riemann sums, instead of just one. 


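7.2.1 Cauchy Criterion A function $f : [a, b] \to \mathbb{R}$ belongs to $\mathcal{R}[a, b]$ if and only if for
every $\varepsilon > 0$ there exists $\eta_\varepsilon > 0$ such that if $\dot{P}$ and $\dot{Q}$ are any tagged partitions of $[a, b]$ with
$\|\dot{P}\| < \eta_\varepsilon$ and $\|\dot{Q}\| < \eta_\varepsilon$, then


\[ |S(f; \dot{P}) - S(f; \dot{Q})| < \varepsilon. \]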


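Proof. ($\Rightarrow$) If $f \in \mathcal{R}[a, b]$ with integral $L$, let $\eta_\varepsilon := \delta_{\varepsilon/2} > 0$ be such that if $\dot{P}$, $\dot{Q}$ are
tagged partitions such that $\|\dot{P}\| < \eta_\varepsilon$ and $\|\dot{Q}\| < \eta_\varepsilon$, then

\[ |S(f; \dot{P}) - L| < \varepsilon/2 \qquad \text{and} \qquad |S(f; \dot{Q}) - L| < \varepsilon/2. \]

Therefore we have

\[
\begin{aligned}
|S(f; \dot{P}) - S(f; \dot{Q})| &= |S(f; \dot{P}) - L + L - S(f; \dot{Q})| \\
&\le |S(f; \dot{P}) - L| + |L - S(f; \dot{Q})| \\
&< \varepsilon/2 + \varepsilon/2 = \varepsilon.
\end{aligned}
\]


($\Leftarrow$) For each $n \in \mathbb{N}$, let $\delta_n > 0$ be such that if $\dot{P}$ and $\dot{Q}$ are tagged partitions with
norms $< \delta_n$, then


\[ |S(f; \dot{P}) - S(f; \dot{Q})| < 1/n. \]

Evidently we may assume that $\delta_n \ge \delta_{n+1}$ for $n \in \mathbb{N}$; otherwise, we replace $\delta_n$ by
$\delta'_n := \min\{\delta_1, \ldots, \delta_n\}$.

For each $n \in \mathbb{N}$, let $\dot{P}_n$ be a tagged partition with $\|\dot{P}_n\| < \delta_n$. Clearly, if $m > n$ then
both $\dot{P}_m$ and $\dot{P}_n$ have norms $< \delta_n$, so that


\[ |S(f; \dot{P}_n) - S(f; \dot{P}_m)| < 1/n \quad \text{for } m > n. \tag{1} \]


Consequently, the sequence $(S(f; \dot{P}_m))_{m=1}^{\infty}$ is a Cauchy sequence in $\mathbb{R}$. Therefore (by
Theorem 3.5.5) this sequence converges in $\mathbb{R}$ and we let $A := \lim_m S(f; \dot{P}_m)$.
Passing to the limit in (1) as $m \to \infty$, we have


\[ |S(f; \dot{P}_n) - A| \le 1/n \quad \text{for all } n \in \mathbb{N}. \]


To see that $A$ is the Riemann integral of $f$, given $\varepsilon > 0$, let $K \in \mathbb{N}$ satisfy $K > 2/\varepsilon$. If $\dot{Q}$ is
any tagged partition with $\|\dot{Q}\| < \delta_K$, then


\[
\begin{aligned}
|S(f; \dot{Q}) - A| &\le |S(f; \dot{Q}) - S(f; \dot{P}_K)| + |S(f; \dot{P}_K) - A| \\
&\le 1/K + 1/K < \varepsilon.
\end{aligned}
\]

Since $\varepsilon > 0$ is arbitrary, then $f \in \mathcal{R}[a, b]$ with integral $A$. Q.E.D.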


We will now give two examples of the use of the Cauchy Criterion. 
7.2.2 Examples (a) Let $g : [0, 3] \to \mathbb{R}$ be the function considered in Example 7.1.4(b).
In that example we saw that if $\dot{P}$ is a tagged partition of $[0, 3]$ with norm $\|\dot{P}\| < \delta$, then


\[ 8 - 5\delta \le S(g; \dot{P}) \le 8 + 5\delta. \]


Hence if $\dot{Q}$ is another tagged partition with $\|\dot{Q}\| < \delta$, then


\[ 8 - 5\delta \le S(g; \dot{Q}) \le 8 + 5\delta. \]


If we subtract these two inequalities, we obtain

\[ |S(g; \dot{P}) - S(g; \dot{Q})| \le 10\delta. \]


In order to make this final term $< \varepsilon$, we are led to employ the Cauchy Criterion with
$\eta_\varepsilon := \varepsilon/20$. (We leave the details to the reader.)
(b) The Cauchy Criterion can be used to show that a function $f : [a, b] \to \mathbb{R}$ is not
Riemann integrable. To do this we need to show that: There exists $\varepsilon_0 > 0$ such that for any
$\eta > 0$ there exist tagged partitions $\dot{P}$ and $\dot{Q}$ with $\|\dot{P}\| < \eta$ and $\|\dot{Q}\| < \eta$ such that
$|S(f; \dot{P}) - S(f; \dot{Q})| \ge \varepsilon_0$.

We will apply these remarks to the Dirichlet function, considered in 5.1.6(g), defined
by $f(x) := 1$ if $x \in [0, 1]$ is rational and $f(x) := 0$ if $x \in [0, 1]$ is irrational.

Here we take $\varepsilon_0 := \tfrac{1}{2}$. If $\dot{P}$ is any tagged partition all of whose tags are rational numbers then
$S(f; \dot{P}) = 1$, while if $\dot{Q}$ is any tagged partition all of whose tags are irrational numbers
then $S(f; \dot{Q}) = 0$. Since we are able to take such tagged partitions with arbitrarily small
norms, we conclude that the Dirichlet function is not Riemann integrable.


The Squeeze Theorem 


In working with the definition of Riemann integral, we have encountered two types of 
difficulties. First, for each partition, there are infinitely many choices of tags. And second, 
there are infinitely many partitions that have a norm less than a specified amount. We have 
experienced dealing with these difficulties in examples and proofs of theorems. We will 
now establish an important tool for proving integrability called the Squeeze Theorem that 
will provide some relief from those difficulties. It states that if a given function can be 
“squeezed” or bracketed between two functions that are known to be Riemann integrable 
with sufficient accuracy, then we may conclude that the given function is also Riemann 
integrable. The exact conditions are given in the statement of the theorem. (The idea of 
squeezing a function to establish integrability led the French mathematician Gaston 
Darboux to develop an approach to integration by means of upper and lower integrals, 
and this approach is presented in Section 7.4.) 


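7.2.3 Squeeze Theorem Let $f : [a, b] \to \mathbb{R}$. Then $f \in \mathcal{R}[a, b]$ if and only if for every
$\varepsilon > 0$ there exist functions $\alpha_\varepsilon$ and $\omega_\varepsilon$ in $\mathcal{R}[a, b]$ with


\[ \alpha_\varepsilon(x) \le f(x) \le \omega_\varepsilon(x) \quad \text{for all } x \in [a, b], \tag{2} \]


and such that


\[ \int_a^b (\omega_\varepsilon - \alpha_\varepsilon) < \varepsilon. \tag{3} \]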
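Proof. ($\Rightarrow$) Take $\alpha_\varepsilon = \omega_\varepsilon = f$ for all $\varepsilon > 0$.
($\Leftarrow$) Let $\varepsilon > 0$. Since $\alpha_\varepsilon$ and $\omega_\varepsilon$ belong to $\mathcal{R}[a, b]$, there exists $\delta_\varepsilon > 0$ such that if $\dot{P}$ is
any tagged partition with $\|\dot{P}\| < \delta_\varepsilon$ then


\[ \Bigl| S(\alpha_\varepsilon; \dot{P}) - \int_a^b \alpha_\varepsilon \Bigr| < \varepsilon \qquad \text{and} \qquad \Bigl| S(\omega_\varepsilon; \dot{P}) - \int_a^b \omega_\varepsilon \Bigr| < \varepsilon. \]

It follows from these inequalities that

\[ \int_a^b \alpha_\varepsilon - \varepsilon < S(\alpha_\varepsilon; \dot{P}) \qquad \text{and} \qquad S(\omega_\varepsilon; \dot{P}) < \int_a^b \omega_\varepsilon + \varepsilon. \]

In view of inequality (2), we have $S(\alpha_\varepsilon; \dot{P}) \le S(f; \dot{P}) \le S(\omega_\varepsilon; \dot{P})$, whence


\[ \int_a^b \alpha_\varepsilon - \varepsilon < S(f; \dot{P}) < \int_a^b \omega_\varepsilon + \varepsilon. \]


If $\dot{Q}$ is another tagged partition with $\|\dot{Q}\| < \delta_\varepsilon$, then we also have


\[ \int_a^b \alpha_\varepsilon - \varepsilon < S(f; \dot{Q}) < \int_a^b \omega_\varepsilon + \varepsilon. \]


If we subtract these two inequalities and use (3), we conclude that


\[
\begin{aligned}
|S(f; \dot{P}) - S(f; \dot{Q})| &< \int_a^b \omega_\varepsilon - \int_a^b \alpha_\varepsilon + 2\varepsilon \\
&= \int_a^b (\omega_\varepsilon - \alpha_\varepsilon) + 2\varepsilon < 3\varepsilon.
\end{aligned}
\]


Since $\varepsilon > 0$ is arbitrary, the Cauchy Criterion implies that $f \in \mathcal{R}[a, b]$. Q.E.D.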


Classes of Riemann Integrable Functions 


The Squeeze Theorem is often used in connection with the class of step functions. It will be
recalled from Definition 5.4.9 that a function $\varphi : [a, b] \to \mathbb{R}$ is a step function if it has only
a finite number of distinct values, each value being assumed on one or more subintervals of
$[a, b]$. For illustrations of step functions, see Figures 5.4.3 or 7.1.4.


7.2.4 Lemma If $J$ is a subinterval of $[a, b]$ having endpoints $c < d$ and if $\varphi_J(x) := 1$ for
$x \in J$ and $\varphi_J(x) := 0$ elsewhere in $[a, b]$, then $\varphi_J \in \mathcal{R}[a, b]$ and $\int_a^b \varphi_J = d - c$.


Proof. If $J = [c, d]$ with $c < d$, this is Exercise 7.1.13 and we can choose $\delta_\varepsilon := \varepsilon/4$.
There are three other subintervals $J$ having the same endpoints $c$ and $d$, namely, $[c, d)$,
$(c, d]$, and $(c, d)$. Since, by Theorem 7.1.3, we can change the value of a function at finitely
many points without changing the integral, we have the same result for these other three
subintervals. Therefore, we conclude that all four functions $\varphi_J$ are integrable with integral
equal to $d - c$. Q.E.D.


It is an important fact that any step function is Riemann integrable. 
7.2.5 Theorem If $\varphi : [a, b] \to \mathbb{R}$ is a step function, then $\varphi \in \mathcal{R}[a, b]$.


Proof. Step functions of the type appearing in 7.2.4 are called “elementary step
functions.” In Exercise 5 it is shown that an arbitrary step function $\varphi$ can be expressed
as a linear combination of such elementary step functions:

\[ \varphi = \sum_{j=1}^{m} k_j \varphi_{J_j}, \tag{4} \]


where $J_j$ has endpoints $c_j < d_j$. The lemma and Theorem 7.1.5(a, b) imply that $\varphi \in \mathcal{R}[a, b]$
and that


\[ \int_a^b \varphi = \sum_{j=1}^{m} k_j (d_j - c_j). \tag{5} \]

Q.E.D.
a yl 


We illustrate the use of step functions and the Squeeze Theorem in the next two 
examples. The first reconsiders a function that originally required a complicated 
calculation. 


7.2.6 Examples (a) The function $g$ in Example 7.1.4(b) is defined by $g(x) = 2$ for
$0 \le x \le 1$ and $g(x) = 3$ for $1 < x \le 3$. We now see that $g$ is a step function and therefore
we calculate its integral to be $\int_0^3 g = 2 \cdot (1 - 0) + 3 \cdot (3 - 1) = 2 + 6 = 8$.

(b) Let $h(x) := x$ on $[0, 1]$ and let $P_n := (0, 1/n, 2/n, \ldots, (n - 1)/n, n/n = 1)$. We
define the step functions $\alpha_n$ and $\omega_n$ on the disjoint subintervals $[0, 1/n)$, $[1/n, 2/n)$,
$\ldots$, $[(n - 2)/n, (n - 1)/n)$, $[(n - 1)/n, 1]$ as follows: $\alpha_n(x) := h((k - 1)/n) =
(k - 1)/n$ for $x$ in $[(k - 1)/n, k/n)$ for $k = 1, 2, \ldots, n - 1$, and $\alpha_n(x) :=
h((n - 1)/n) = (n - 1)/n$ for $x$ in $[(n - 1)/n, 1]$. That is, $\alpha_n$ has the minimum value
of $h$ on each subinterval. Similarly, we define $\omega_n$ to be the maximum value of $h$ on each
subinterval, that is, $\omega_n(x) := k/n$ for $x$ in $[(k - 1)/n, k/n)$ for $k = 1, 2, \ldots, n - 1$, and
$\omega_n(x) := 1$ for $x$ in $[(n - 1)/n, 1]$. (The reader should draw a sketch for the case $n = 4$.)

Then we get


\[
\int_0^1 \alpha_n = \frac{1}{n}\Bigl(0 + \frac{1}{n} + \frac{2}{n} + \cdots + \frac{n - 1}{n}\Bigr)
= \frac{1}{n^2}\bigl(1 + 2 + \cdots + (n - 1)\bigr)
= \frac{1}{n^2} \cdot \frac{(n - 1)n}{2}
= \tfrac{1}{2}(1 - 1/n).
\]

In a similar manner, we also get $\int_0^1 \omega_n = \tfrac{1}{2}(1 + 1/n)$. Thus we have

\[ \alpha_n(x) \le h(x) \le \omega_n(x) \]

for $x \in [0, 1]$ and

\[ \int_0^1 (\omega_n - \alpha_n) = \frac{1}{n}. \]

Since for a given $\varepsilon > 0$, we can choose $n$ so that $1/n < \varepsilon$, it follows from the Squeeze Theorem
that $h$ is integrable. We also see that the value of the integral of $h$ lies between the integrals
of $\alpha_n$ and $\omega_n$ for all $n$ and therefore has value $\tfrac{1}{2}$.
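
The two step-function integrals in (b) can be recomputed directly from formula (5) of Theorem 7.2.5. The following sketch does so in exact arithmetic; the function name `step_integrals` and the sample values of $n$ are illustrative assumptions.

```python
from fractions import Fraction

def step_integrals(n):
    """Integrals over [0, 1] of the lower and upper step functions for h(x) = x."""
    width = Fraction(1, n)
    lower = sum(Fraction(k - 1, n) * width for k in range(1, n + 1))  # values (k-1)/n
    upper = sum(Fraction(k, n) * width for k in range(1, n + 1))      # values k/n
    return lower, upper

for n in (4, 10, 100):
    lower, upper = step_integrals(n)
    # lower = (1 - 1/n)/2, upper = (1 + 1/n)/2, and the gap is 1/n
    print(n, lower, upper, upper - lower)
```

For $n = 4$ this gives 3/8 and 5/8, with gap 1/4, matching the closed forms derived above.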


We will now use the Squeeze Theorem to show that an arbitrary continuous function is 
Riemann integrable. 




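7.2.7 Theorem If $f : [a, b] \to \mathbb{R}$ is continuous on $[a, b]$, then $f \in \mathcal{R}[a, b]$.


Proof. It follows from Theorem 5.4.3 that $f$ is uniformly continuous on $[a, b]$. Therefore,
given $\varepsilon > 0$ there exists $\delta_\varepsilon > 0$ such that if $u, v \in [a, b]$ and $|u - v| < \delta_\varepsilon$, then we have
$|f(u) - f(v)| < \varepsilon/(b - a)$.

Let $P = \{I_i\}_{i=1}^{n}$ be a partition such that $\|P\| < \delta_\varepsilon$. Applying Theorem 5.3.4 we let
$u_i \in I_i$ be a point where $f$ attains its minimum value on $I_i$ and let $v_i \in I_i$ be a point where $f$
attains its maximum value on $I_i$.

Let $\alpha_\varepsilon$ be the step function defined by $\alpha_\varepsilon(x) := f(u_i)$ for $x \in [x_{i-1}, x_i)$ ($i = 1, \ldots,
n - 1$) and $\alpha_\varepsilon(x) := f(u_n)$ for $x \in [x_{n-1}, x_n]$. Let $\omega_\varepsilon$ be defined similarly using the points $v_i$
instead of the $u_i$. Then one has


\[ \alpha_\varepsilon(x) \le f(x) \le \omega_\varepsilon(x) \quad \text{for all } x \in [a, b]. \]


Moreover, it is clear that


\[
\begin{aligned}
0 \le \int_a^b (\omega_\varepsilon - \alpha_\varepsilon) &= \sum_{i=1}^{n} \bigl(f(v_i) - f(u_i)\bigr)(x_i - x_{i-1}) \\
&< \sum_{i=1}^{n} \Bigl(\frac{\varepsilon}{b - a}\Bigr)(x_i - x_{i-1}) = \varepsilon.
\end{aligned}
\]


Therefore it follows from the Squeeze Theorem that $f \in \mathcal{R}[a, b]$. Q.E.D.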


Monotone functions are not necessarily continuous at every point, but they are also 
Riemann integrable. 


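7.2.8 Theorem If $f : [a, b] \to \mathbb{R}$ is monotone on $[a, b]$, then $f \in \mathcal{R}[a, b]$.


Proof. Assume that $f$ is increasing on $I = [a, b]$. Partitioning the interval into $n$ equal
subintervals $I_k = [x_{k-1}, x_k]$ gives us $x_k - x_{k-1} = (b - a)/n$, $k = 1, 2, \ldots, n$. Since $f$
is increasing on $I_k$, its minimum value is attained at the left endpoint $x_{k-1}$ and its
maximum value is attained at the right endpoint $x_k$. Therefore, we define the step
functions $\alpha(x) := f(x_{k-1})$ and $\omega(x) := f(x_k)$ for $x \in [x_{k-1}, x_k)$, $k = 1, 2, \ldots, n - 1$, and
$\alpha(x) := f(x_{n-1})$ and $\omega(x) := f(x_n)$ for $x \in [x_{n-1}, x_n]$. Then we have $\alpha(x) \le f(x) \le
\omega(x)$ for all $x \in I$, and


\[ \int_a^b \alpha = \frac{b - a}{n}\bigl(f(x_0) + f(x_1) + \cdots + f(x_{n-1})\bigr), \]


\[ \int_a^b \omega = \frac{b - a}{n}\bigl(f(x_1) + \cdots + f(x_{n-1}) + f(x_n)\bigr). \]

Subtracting, we obtain

\[ \int_a^b (\omega - \alpha) = \frac{b - a}{n}\bigl(f(x_n) - f(x_0)\bigr) = \frac{b - a}{n}\bigl(f(b) - f(a)\bigr). \]


Thus for a given $\varepsilon > 0$, we choose $n$ such that $n > (b - a)(f(b) - f(a))/\varepsilon$. Then we have
$\int_a^b (\omega - \alpha) < \varepsilon$ and the Squeeze Theorem implies that $f$ is integrable on $I$. Q.E.D.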
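
The telescoping identity in this proof, and the resulting choice of $n$, can be illustrated as follows. This is a sketch under assumptions: the increasing function `f` below (with a jump at $x = 1$) is a hypothetical example, and `squeeze_gap` and `pieces_needed` are names introduced here for illustration.

```python
from fractions import Fraction

def f(x):
    """An increasing function on [0, 2] with a jump at x = 1 (illustrative choice)."""
    return x * x + (1 if x >= 1 else 0)

def squeeze_gap(func, a, b, n):
    """Integral of (omega - alpha) for the partition of [a, b] into n equal pieces."""
    a, b = Fraction(a), Fraction(b)
    width = (b - a) / n
    points = [a + k * width for k in range(n + 1)]
    lower = sum(func(x) * width for x in points[:-1])  # integral of alpha
    upper = sum(func(x) * width for x in points[1:])   # integral of omega
    return upper - lower

def pieces_needed(func, a, b, eps):
    """Smallest integer n with n > (b - a)(f(b) - f(a)) / eps."""
    a, b = Fraction(a), Fraction(b)
    bound = (b - a) * (func(b) - func(a)) / Fraction(eps)
    return int(bound) + 1

n = pieces_needed(f, 0, 2, Fraction(1, 10))
print(n)                                        # 101
print(squeeze_gap(f, 0, 2, n))                  # 10/101
print(squeeze_gap(f, 0, 2, n) < Fraction(1, 10))  # True
```

Here $(b - a)(f(b) - f(a)) = 2 \cdot 5 = 10$, so the gap is exactly $10/n$ regardless of the jump; continuity plays no role in the estimate.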




The Additivity Theorem 


We now return to arbitrary Riemann integrable functions. Our next result shows that the 
integral is an “additive function” of the interval over which the function is integrated. 
This property is no surprise, but its proof is a bit delicate and may be omitted on a first 
reading. 


7.2.9 Additivity Theorem Let $f : [a,b] \to \mathbb{R}$ and let $c \in (a,b)$. Then $f \in \mathcal{R}[a, b]$ if and only if its restrictions to $[a, c]$ and $[c, b]$ are both Riemann integrable. In this case

$$\int_a^b f = \int_a^c f + \int_c^b f. \tag{6}$$


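Proof. ($\Leftarrow$) Suppose that the restriction $f_1$ of $f$ to $[a, c]$, and the restriction $f_2$ of $f$ to $[c, b]$ are Riemann integrable to $L_1$ and $L_2$, respectively. Then, given $\varepsilon > 0$ there exists $\delta' > 0$ such that if $\dot{\mathcal{P}}_1$ is a tagged partition of $[a, c]$ with $\|\dot{\mathcal{P}}_1\| < \delta'$, then $|S(f_1; \dot{\mathcal{P}}_1) - L_1| < \varepsilon/3$. Also there exists $\delta'' > 0$ such that if $\dot{\mathcal{P}}_2$ is a tagged partition of $[c, b]$ with $\|\dot{\mathcal{P}}_2\| < \delta''$, then $|S(f_2; \dot{\mathcal{P}}_2) - L_2| < \varepsilon/3$. If $M$ is a bound for $|f|$, we define $\delta_\varepsilon := \min\{\delta', \delta'', \varepsilon/6M\}$ and let $\dot{\mathcal{Q}}$ be a tagged partition of $[a, b]$ with $\|\dot{\mathcal{Q}}\| < \delta_\varepsilon$. We will prove that

$$|S(f; \dot{\mathcal{Q}}) - (L_1 + L_2)| < \varepsilon. \tag{7}$$

(i) If $c$ is a partition point of $\dot{\mathcal{Q}}$, we split $\dot{\mathcal{Q}}$ into a partition $\dot{\mathcal{Q}}_1$ of $[a, c]$ and a partition $\dot{\mathcal{Q}}_2$ of $[c, b]$. Since $S(f; \dot{\mathcal{Q}}) = S(f; \dot{\mathcal{Q}}_1) + S(f; \dot{\mathcal{Q}}_2)$, and since $\dot{\mathcal{Q}}_1$ has norm $< \delta'$ and $\dot{\mathcal{Q}}_2$ has norm $< \delta''$, the inequality (7) is clear.

(ii) If $c$ is not a partition point in $\dot{\mathcal{Q}} = \{(I_k, t_k)\}_{k=1}^{m}$, there exists $k \le m$ such that $c \in (x_{k-1}, x_k)$. We let $\dot{\mathcal{Q}}_1$ be the tagged partition of $[a, c]$ defined by

$$\dot{\mathcal{Q}}_1 := \{(I_1, t_1), \ldots, (I_{k-1}, t_{k-1}), ([x_{k-1}, c], c)\},$$

and $\dot{\mathcal{Q}}_2$ be the tagged partition of $[c, b]$ defined by

$$\dot{\mathcal{Q}}_2 := \{([c, x_k], c), (I_{k+1}, t_{k+1}), \ldots, (I_m, t_m)\}.$$

A straightforward calculation shows that

$$S(f; \dot{\mathcal{Q}}) - S(f; \dot{\mathcal{Q}}_1) - S(f; \dot{\mathcal{Q}}_2) = f(t_k)(x_k - x_{k-1}) - f(c)(x_k - x_{k-1}) = \bigl(f(t_k) - f(c)\bigr)\cdot(x_k - x_{k-1}),$$

whence it follows that

$$|S(f; \dot{\mathcal{Q}}) - S(f; \dot{\mathcal{Q}}_1) - S(f; \dot{\mathcal{Q}}_2)| \le 2M(x_k - x_{k-1}) < \varepsilon/3.$$

But since $\|\dot{\mathcal{Q}}_1\| < \delta_\varepsilon \le \delta'$ and $\|\dot{\mathcal{Q}}_2\| < \delta_\varepsilon \le \delta''$, it follows that

$$|S(f; \dot{\mathcal{Q}}_1) - L_1| < \varepsilon/3 \quad\text{and}\quad |S(f; \dot{\mathcal{Q}}_2) - L_2| < \varepsilon/3,$$

from which we obtain (7). Since $\varepsilon > 0$ is arbitrary, we infer that $f \in \mathcal{R}[a, b]$ and that (6) holds.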


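($\Rightarrow$) We suppose that $f \in \mathcal{R}[a, b]$ and, given $\varepsilon > 0$, we let $\eta_\varepsilon > 0$ satisfy the Cauchy Criterion 7.2.1. Let $f_1$ be the restriction of $f$ to $[a, c]$ and let $\dot{\mathcal{P}}_1$, $\dot{\mathcal{Q}}_1$ be tagged partitions of $[a, c]$ with $\|\dot{\mathcal{P}}_1\| < \eta_\varepsilon$ and $\|\dot{\mathcal{Q}}_1\| < \eta_\varepsilon$. By adding additional partition points and tags from $[c, b]$, we can extend $\dot{\mathcal{P}}_1$ and $\dot{\mathcal{Q}}_1$ to tagged partitions $\dot{\mathcal{P}}$ and $\dot{\mathcal{Q}}$ of $[a, b]$ that satisfy $\|\dot{\mathcal{P}}\| < \eta_\varepsilon$ and $\|\dot{\mathcal{Q}}\| < \eta_\varepsilon$. If we use the same additional points and tags in $[c, b]$ for both $\dot{\mathcal{P}}$ and $\dot{\mathcal{Q}}$, then

$$S(f_1; \dot{\mathcal{P}}_1) - S(f_1; \dot{\mathcal{Q}}_1) = S(f; \dot{\mathcal{P}}) - S(f; \dot{\mathcal{Q}}).$$

Since both $\dot{\mathcal{P}}$ and $\dot{\mathcal{Q}}$ have norm $< \eta_\varepsilon$, then $|S(f_1; \dot{\mathcal{P}}_1) - S(f_1; \dot{\mathcal{Q}}_1)| < \varepsilon$. Therefore the Cauchy Condition shows that the restriction $f_1$ of $f$ to $[a, c]$ is in $\mathcal{R}[a, c]$. In the same way, we see that the restriction $f_2$ of $f$ to $[c, b]$ is in $\mathcal{R}[c, b]$.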

The equality (6) now follows from the first part of the theorem. Q.E.D. 
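The "straightforward calculation" in case (ii) of the proof can be illustrated on a concrete tagged partition. This is a hedged sketch only: the helper names `riemann_sum` and `split_at`, the choice of $f = \cos$, and the partition are assumptions made for the example.

```python
import math


def riemann_sum(f, tagged):
    """Riemann sum for a tagged partition given as ((left, right), tag) pairs."""
    return sum(f(tag) * (right - left) for (left, right), tag in tagged)


def split_at(tagged, c):
    """Case (ii): c lies strictly inside one subinterval. That subinterval is
    cut at c and both pieces receive the tag c."""
    q1, q2, straddling = [], [], None
    for (left, right), tag in tagged:
        if right <= c:
            q1.append(((left, right), tag))
        elif left >= c:
            q2.append(((left, right), tag))
        else:
            straddling = ((left, right), tag)
            q1.append(((left, c), c))
            q2.append(((c, right), c))
    return q1, q2, straddling


points = [0.0, 0.25, 0.5, 0.75, 1.0]
q = [((l, r), (l + r) / 2) for l, r in zip(points, points[1:])]  # midpoint tags
c = 0.6  # not a partition point
q1, q2, ((x_prev, x_k), t_k) = split_at(q, c)

lhs = riemann_sum(math.cos, q) - riemann_sum(math.cos, q1) - riemann_sum(math.cos, q2)
rhs = (math.cos(t_k) - math.cos(c)) * (x_k - x_prev)
print(lhs, rhs)
```

Only the subinterval containing $c$ contributes to the difference, so the discrepancy is bounded by $2M(x_k - x_{k-1})$ with $M = 1$ for the cosine.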


7.2.10 Corollary If $f \in \mathcal{R}[a,b]$, and if $[c,d] \subseteq [a,b]$, then the restriction of $f$ to $[c, d]$ is in $\mathcal{R}[c,d]$.


Proof. Since $f \in \mathcal{R}[a,b]$ and $c \in [a, b]$, it follows from the theorem that its restriction to $[c, b]$ is in $\mathcal{R}[c, b]$. But if $d \in [c, b]$, then another application of the theorem shows that the restriction of $f$ to $[c,d]$ is in $\mathcal{R}[c, d]$. Q.E.D.


7.2.11 Corollary If $f \in \mathcal{R}[a,b]$ and if $a = c_0 < c_1 < \cdots < c_m = b$, then the restrictions of $f$ to each of the subintervals $[c_{i-1}, c_i]$ are Riemann integrable and

$$\int_a^b f = \sum_{i=1}^{m} \int_{c_{i-1}}^{c_i} f.$$


Until now, we have considered the Riemann integral over an interval [a, b] where 
a < b. It is convenient to have the integral defined more generally. 


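7.2.12 Definition If $f \in \mathcal{R}[a,b]$ and if $\alpha, \beta \in [a,b]$ with $\alpha < \beta$, we define

$$\int_\beta^\alpha f := -\int_\alpha^\beta f \quad\text{and}\quad \int_\alpha^\alpha f := 0.$$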


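7.2.13 Theorem If $f \in \mathcal{R}[a,b]$ and if $\alpha, \beta, \gamma$ are any numbers in $[a, b]$, then

$$\int_\alpha^\beta f = \int_\alpha^\gamma f + \int_\gamma^\beta f, \tag{8}$$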


in the sense that the existence of any two of these integrals implies the existence of the 
third integral and the equality (8). 


Proof. If any two of the numbers $\alpha, \beta, \gamma$ are equal, then (8) holds. Thus we may suppose that all three of these numbers are distinct.
For the sake of symmetry, we introduce the expression

$$L(\alpha, \beta, \gamma) := \int_\alpha^\beta f + \int_\beta^\gamma f + \int_\gamma^\alpha f.$$

It is clear that (8) holds if and only if $L(\alpha, \beta, \gamma) = 0$. Therefore, to establish the assertion, we need to show that $L = 0$ for all six permutations of the arguments $\alpha$, $\beta$, and $\gamma$.

We note that the Additivity Theorem 7.2.9 implies that $L(\alpha, \beta, \gamma) = 0$ when $\alpha < \gamma < \beta$. But it is easily seen that both $L(\beta, \gamma, \alpha)$ and $L(\gamma, \alpha, \beta)$ equal $L(\alpha, \beta, \gamma)$. Moreover, the numbers

$$L(\beta, \alpha, \gamma), \quad L(\alpha, \gamma, \beta), \quad\text{and}\quad L(\gamma, \beta, \alpha)$$

are all equal to $-L(\alpha, \beta, \gamma)$. Therefore, $L$ vanishes for all possible configurations of these three points. Q.E.D.


Exercises for Section 7.2 


1. Let $f : [a,b] \to \mathbb{R}$. Show that $f \notin \mathcal{R}[a, b]$ if and only if there exists $\varepsilon_0 > 0$ such that for every $n \in \mathbb{N}$ there exist tagged partitions $\dot{\mathcal{P}}_n$ and $\dot{\mathcal{Q}}_n$ with $\|\dot{\mathcal{P}}_n\| < 1/n$ and $\|\dot{\mathcal{Q}}_n\| < 1/n$ such that $|S(f; \dot{\mathcal{P}}_n) - S(f; \dot{\mathcal{Q}}_n)| \ge \varepsilon_0$.

2. Consider the function $h$ defined by $h(x) := x + 1$ for $x \in [0,1]$ rational, and $h(x) := 0$ for $x \in [0, 1]$ irrational. Show that $h$ is not Riemann integrable.

3. Let $H(x) := k$ for $x = 1/k$ $(k \in \mathbb{N})$ and $H(x) := 0$ elsewhere on $[0, 1]$. Use Exercise 1, or the argument in 7.2.2(b), to show that $H$ is not Riemann integrable.

4. If $\alpha(x) := -x$ and $\omega(x) := x$ and if $\alpha(x) \le f(x) \le \omega(x)$ for all $x \in [0, 1]$, does it follow from the Squeeze Theorem 7.2.3 that $f \in \mathcal{R}[0, 1]$?

5. If $J$ is any subinterval of $[a, b]$ and if $\varphi_J(x) := 1$ for $x \in J$ and $\varphi_J(x) := 0$ elsewhere on $[a, b]$, we say that $\varphi_J$ is an elementary step function on $[a, b]$. Show that every step function is a linear combination of elementary step functions.

6. If $\psi : [a,b] \to \mathbb{R}$ takes on only a finite number of distinct values, is $\psi$ a step function?

7. If $S(f; \dot{\mathcal{P}})$ is any Riemann sum of $f : [a,b] \to \mathbb{R}$, show that there exists a step function $\varphi : [a,b] \to \mathbb{R}$ such that $\int_a^b \varphi = S(f; \dot{\mathcal{P}})$.

8. Suppose that $f$ is continuous on $[a, b]$, that $f(x) \ge 0$ for all $x \in [a,b]$ and that $\int_a^b f = 0$. Prove that $f(x) = 0$ for all $x \in [a, b]$.

9. Show that the continuity hypothesis in the preceding exercise cannot be dropped.

10. If $f$ and $g$ are continuous on $[a, b]$ and if $\int_a^b f = \int_a^b g$, prove that there exists $c \in [a, b]$ such that $f(c) = g(c)$.

11. If $f$ is bounded by $M$ on $[a, b]$ and if the restriction of $f$ to every interval $[c, b]$ where $c \in (a, b)$ is Riemann integrable, show that $f \in \mathcal{R}[a, b]$ and that $\int_c^b f \to \int_a^b f$ as $c \to a+$. [Hint: Let $\alpha_c(x) := -M$ and $\omega_c(x) := M$ for $x \in [a,c)$ and $\alpha_c(x) := \omega_c(x) := f(x)$ for $x \in [c, b]$. Apply the Squeeze Theorem 7.2.3 for $c$ sufficiently near $a$.]

12. Show that $g(x) := \sin(1/x)$ for $x \in (0, 1]$ and $g(0) := 0$ belongs to $\mathcal{R}[0, 1]$.

13. Give an example of a function $f : [a,b] \to \mathbb{R}$ that is in $\mathcal{R}[c, b]$ for every $c \in (a, b)$ but which is not in $\mathcal{R}[a, b]$.

14. Suppose that $f : [a,b] \to \mathbb{R}$, that $a = c_0 < c_1 < \cdots < c_m = b$ and that the restrictions of $f$ to $[c_{i-1}, c_i]$ belong to $\mathcal{R}[c_{i-1}, c_i]$ for $i = 1, \ldots, m$. Prove that $f \in \mathcal{R}[a, b]$ and that the formula in Corollary 7.2.11 holds.

15. If $f$ is bounded and there is a finite set $E$ such that $f$ is continuous at every point of $[a, b] \setminus E$, show that $f \in \mathcal{R}[a, b]$.

16. If $f$ is continuous on $[a, b]$, $a < b$, show that there exists $c \in [a, b]$ such that we have $\int_a^b f = f(c)(b - a)$. This result is sometimes called the Mean Value Theorem for Integrals.

17. If $f$ and $g$ are continuous on $[a, b]$ and $g(x) > 0$ for all $x \in [a, b]$, show that there exists $c \in [a, b]$ such that $\int_a^b fg = f(c) \int_a^b g$. Show that this conclusion fails if we do not have $g(x) > 0$. (Note that this result is an extension of the preceding exercise.)

18. Let $f$ be continuous on $[a, b]$, let $f(x) \ge 0$ for $x \in [a,b]$, and let $M_n := \bigl(\int_a^b f^n\bigr)^{1/n}$. Show that $\lim(M_n) = \sup\{f(x) : x \in [a, b]\}$.

19. Suppose that $a > 0$ and that $f \in \mathcal{R}[-a, a]$.
(a) If $f$ is even (that is, if $f(-x) = f(x)$ for all $x \in [0,a]$), show that $\int_{-a}^{a} f = 2\int_0^a f$.
(b) If $f$ is odd (that is, if $f(-x) = -f(x)$ for all $x \in [0,a]$), show that $\int_{-a}^{a} f = 0$.

20. If $f$ is continuous on $[-a, a]$, show that $\int_{-a}^{a} f(x^2)\,dx = 2\int_0^a f(x^2)\,dx$.


Section 7.3 The Fundamental Theorem 


We will now explore the connection between the notions of the derivative and the integral. 
In fact, there are two theorems relating to this problem: one has to do with integrating a 
derivative, and the other with differentiating an integral. These theorems, taken together, 
are called the Fundamental Theorem of Calculus. Roughly stated, they imply that the 
operations of differentiation and integration are inverse to each other. However, there are 
some subtleties that should not be overlooked. 


The Fundamental Theorem (First Form) 


The First Form of the Fundamental Theorem provides a theoretical basis for the method of calculating an integral that the reader learned in calculus. It asserts that if a function $f$ is the derivative of a function $F$, and if $f$ belongs to $\mathcal{R}[a,b]$, then the integral $\int_a^b f$ can be calculated by means of the evaluation $F\big|_a^b := F(b) - F(a)$. A function $F$ such that $F'(x) = f(x)$ for all $x \in [a, b]$ is called an antiderivative or a primitive of $f$ on $[a, b]$. Thus, when $f$ has an antiderivative, it is a very simple matter to calculate its integral.

In practice, it is convenient to allow some exceptional points c where F’(c) does not 
exist in R, or where it does not equal f(c). It turns out that we can permit a finite number of 
such exceptional points. 


7.3.1 Fundamental Theorem of Calculus (First Form) Suppose there is a finite set $E$ in $[a, b]$ and functions $f, F : [a,b] \to \mathbb{R}$ such that:


(a) $F$ is continuous on $[a, b]$,
(b) $F'(x) = f(x)$ for all $x \in [a,b] \setminus E$,
(c) $f$ belongs to $\mathcal{R}[a, b]$.


Then we have

$$\int_a^b f = F(b) - F(a). \tag{1}$$


Proof. We will prove the theorem in the case where E := {a,b}. The general case can 
be obtained by breaking the interval into the union of a finite number of intervals (see 
Exercise 1). 

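Let $\varepsilon > 0$ be given. Since $f \in \mathcal{R}[a, b]$ by assumption (c), there exists $\delta_\varepsilon > 0$ such that if $\dot{\mathcal{P}}$ is any tagged partition with $\|\dot{\mathcal{P}}\| < \delta_\varepsilon$, then

$$\left| S(f; \dot{\mathcal{P}}) - \int_a^b f \right| < \varepsilon. \tag{2}$$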




If the subintervals in $\dot{\mathcal{P}}$ are $[x_{i-1}, x_i]$, then the Mean Value Theorem 6.2.4 applied to $F$ on $[x_{i-1}, x_i]$ implies that there exists $u_i \in (x_{i-1}, x_i)$ such that

$$F(x_i) - F(x_{i-1}) = F'(u_i) \cdot (x_i - x_{i-1}) \quad\text{for } i = 1, \ldots, n.$$

If we add these terms, note the telescoping of the sum, and use the fact that $F'(u_i) = f(u_i)$, we obtain

$$F(b) - F(a) = \sum_{i=1}^{n} \bigl(F(x_i) - F(x_{i-1})\bigr) = \sum_{i=1}^{n} f(u_i)(x_i - x_{i-1}).$$

Now let $\dot{\mathcal{P}}_u := \{([x_{i-1}, x_i], u_i)\}_{i=1}^{n}$, so the sum on the right equals $S(f; \dot{\mathcal{P}}_u)$. If we substitute $F(b) - F(a) = S(f; \dot{\mathcal{P}}_u)$ into (2), we conclude that

$$\left| F(b) - F(a) - \int_a^b f \right| < \varepsilon.$$

But, since $\varepsilon > 0$ is arbitrary, we infer that equation (1) holds. Q.E.D.
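The key step of the proof, that the Riemann sum with Mean Value Theorem tags telescopes to $F(b) - F(a)$ for every partition, can be observed numerically. The sketch below assumes the illustrative choice $F(x) = x^3$ on $[0, 2]$, for which the Mean Value Theorem point can be written in closed form; the names `mvt_tag`, `f`, and `F` are hypothetical helpers introduced for the example.

```python
import math


def F(x):
    """Antiderivative used for the illustration."""
    return x ** 3


def f(x):
    """f = F'."""
    return 3.0 * x * x


def mvt_tag(left, right):
    """Point u in (left, right) with F'(u) = (F(right) - F(left)) / (right - left),
    valid for F(x) = x**3 when 0 <= left < right."""
    return math.sqrt((left * left + left * right + right * right) / 3.0)


points = [0.0, 0.1, 0.35, 0.5, 1.2, 2.0]  # an arbitrary partition of [0, 2]
pairs = list(zip(points, points[1:]))
riemann = sum(f(mvt_tag(l, r)) * (r - l) for l, r in pairs)
print(riemann, F(2.0) - F(0.0))
```

The partition is deliberately uneven: the agreement does not depend on the norm of the partition, only on the choice of tags.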


Remark If the function F is differentiable at every point of [a, b], then (by Theorem 
6.1.2) hypothesis (a) is automatically satisfied. If $f$ is not defined for some point $c \in E$, we 
take f(c) := 0. Even if F is differentiable at every point of [a, b], condition (c) is not 
automatically satisfied, since there exist functions F such that F’ is not Riemann 
integrable. (See Example 7.3.2(e).) 


7.3.2 Examples (a) If $F(x) := \tfrac{1}{2}x^2$ for all $x \in [a, b]$, then $F'(x) = x$ for all $x \in [a,b]$. Further, $f = F'$ is continuous so it is in $\mathcal{R}[a, b]$. Therefore the Fundamental Theorem (with $E = \emptyset$) implies that

$$\int_a^b x\,dx = F(b) - F(a) = \tfrac{1}{2}(b^2 - a^2).$$

(b) If $G(x) := \operatorname{Arctan} x$ for $x \in [a,b]$, then $G'(x) = 1/(x^2 + 1)$ for all $x \in [a,b]$; also $G'$ is continuous, so it is in $\mathcal{R}[a,b]$. Therefore the Fundamental Theorem (with $E = \emptyset$) implies that

$$\int_a^b \frac{1}{x^2 + 1}\,dx = \operatorname{Arctan} b - \operatorname{Arctan} a.$$

(c) If $A(x) := |x|$ for $x \in [-10, 10]$, then $A'(x) = -1$ if $x \in [-10, 0)$ and $A'(x) = +1$ for $x \in (0, 10]$. Recalling the definition of the signum function (in 4.1.10(b)), we have $A'(x) = \operatorname{sgn}(x)$ for all $x \in [-10, 10] \setminus \{0\}$. Since the signum function is a step function, it belongs to $\mathcal{R}[-10, 10]$. Therefore the Fundamental Theorem (with $E = \{0\}$) implies that

$$\int_{-10}^{10} \operatorname{sgn}(x)\,dx = A(10) - A(-10) = 10 - 10 = 0.$$

(d) If $H(x) := 2\sqrt{x}$ for $x \in [0, b]$, then $H$ is continuous on $[0, b]$ and $H'(x) = 1/\sqrt{x}$ for $x \in (0, b]$. Since $h := H'$ is not bounded on $(0, b]$, it does not belong to $\mathcal{R}[0, b]$ no matter how we define $h(0)$. Therefore, the Fundamental Theorem 7.3.1 does not apply. (However, we will see in Example 10.1.10(a) that $h$ is generalized Riemann integrable on $[0, b]$.)

(e) Let $K(x) := x^2 \cos(1/x^2)$ for $x \in (0, 1]$ and let $K(0) := 0$. It follows from the Product Rule 6.1.3(c) and the Chain Rule 6.1.6 that

$$K'(x) = 2x\cos(1/x^2) + (2/x)\sin(1/x^2) \quad\text{for } x \in (0, 1].$$

Further, as in Example 6.1.7(e), it can be shown that $K'(0) = 0$. Thus $K$ is continuous and differentiable at every point of $[0, 1]$. Since it can be seen that the function $K'$ is not bounded on $[0, 1]$, it does not belong to $\mathcal{R}[0, 1]$ and the Fundamental Theorem 7.3.1 does not apply to $K'$. (However, we will see from Theorem 10.1.9 that $K'$ is generalized Riemann integrable on $[0, 1]$.)
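One way to see that $K'$ is unbounded in Example 7.3.2(e) is to evaluate it along points where $\sin(1/x^2) = 1$. The sketch below is illustrative only; the sequence `x_n` with $1/x_n^2 = \pi/2 + 2\pi n$ is an assumption chosen for the demonstration, not a construction taken from the text.

```python
import math


def K_prime(x):
    """Derivative of K(x) = x^2 cos(1/x^2) for x in (0, 1]."""
    return 2.0 * x * math.cos(1.0 / x ** 2) + (2.0 / x) * math.sin(1.0 / x ** 2)


def x_n(n):
    """Point in (0, 1] with 1/x^2 = pi/2 + 2*pi*n, so sin(1/x^2) = 1."""
    return 1.0 / math.sqrt(math.pi / 2.0 + 2.0 * math.pi * n)


indices = (1, 10, 100, 1000)
values = [K_prime(x_n(n)) for n in indices]
# At these points K'(x_n) is approximately 2/x_n, which grows without bound.
expected = [2.0 * math.sqrt(math.pi / 2.0 + 2.0 * math.pi * n) for n in indices]
print(values)
```

Since $x_n \to 0$ while $K'(x_n) \to \infty$, no choice of bound $M$ works on $[0, 1]$, even though $K'(0) = 0$ exists.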


The Fundamental Theorem (Second Form) 


We now turn to the Fundamental Theorem (Second Form) in which we wish to differentiate 
an integral involving a variable upper limit. 


7.3.3 Definition If $f \in \mathcal{R}[a, b]$, then the function defined by

$$F(z) := \int_a^z f \quad\text{for } z \in [a, b], \tag{3}$$


is called the indefinite integral of f with basepoint a. (Sometimes a point other than a is 
used as a basepoint; see Exercise 6.) 


We will first show that if $f \in \mathcal{R}[a, b]$, then its indefinite integral $F$ satisfies a Lipschitz 
condition; hence F is continuous on [a, b]. 


7.3.4 Theorem The indefinite integral $F$ defined by (3) is continuous on $[a, b]$. In fact, if $|f(x)| \le M$ for all $x \in [a,b]$, then $|F(z) - F(w)| \le M|z - w|$ for all $z, w \in [a, b]$.


Proof. The Additivity Theorem 7.2.9 implies that if $z, w \in [a,b]$ and $w \le z$, then

$$F(z) = \int_a^z f = \int_a^w f + \int_w^z f = F(w) + \int_w^z f,$$

whence we have

$$F(z) - F(w) = \int_w^z f.$$

Now if $-M \le f(x) \le M$ for all $x \in [a,b]$, then Theorem 7.1.5(c) implies that

$$-M(z - w) \le \int_w^z f \le M(z - w),$$

whence it follows that

$$|F(z) - F(w)| \le \left| \int_w^z f \right| \le M|z - w|,$$

as asserted. Q.E.D.


We will now show that the indefinite integral F is differentiable at any point where f 
is continuous. 
7.3.5 Fundamental Theorem of Calculus (Second Form) Let $f \in \mathcal{R}[a, b]$ and let $f$ be continuous at a point $c \in [a,b]$. Then the indefinite integral, defined by (3), is differentiable at $c$ and $F'(c) = f(c)$.


Proof. We will suppose that $c \in [a,b)$ and consider the right-hand derivative of $F$ at $c$. Since $f$ is continuous at $c$, given $\varepsilon > 0$ there exists $\eta_\varepsilon > 0$ such that if $c \le x < c + \eta_\varepsilon$, then

$$f(c) - \varepsilon < f(x) < f(c) + \varepsilon. \tag{4}$$

Let $h$ satisfy $0 < h < \eta_\varepsilon$. The Additivity Theorem 7.2.9 implies that $f$ is integrable on the intervals $[a,c]$, $[a, c+h]$ and $[c, c+h]$ and that

$$F(c + h) - F(c) = \int_c^{c+h} f.$$

Now on the interval $[c, c + h]$ the function $f$ satisfies inequality (4), so that we have

$$(f(c) - \varepsilon) \cdot h \le F(c + h) - F(c) = \int_c^{c+h} f \le (f(c) + \varepsilon) \cdot h.$$

If we divide by $h > 0$ and subtract $f(c)$, we obtain

$$\left| \frac{F(c + h) - F(c)}{h} - f(c) \right| \le \varepsilon.$$

But, since $\varepsilon > 0$ is arbitrary, we conclude that the right-hand limit is given by

$$\lim_{h \to 0+} \frac{F(c + h) - F(c)}{h} = f(c).$$

It is proved in the same way that the left-hand limit of this difference quotient also equals $f(c)$ when $c \in (a, b]$, whence the assertion follows. Q.E.D.


If f is continuous on all of [a, b], we obtain the following result. 


7.3.6 Theorem If $f$ is continuous on $[a, b]$, then the indefinite integral $F$, defined by (3), is differentiable on $[a, b]$ and $F'(x) = f(x)$ for all $x \in [a,b]$.


Theorem 7.3.6 can be summarized: If f is continuous on [a, b], then its indefinite 
integral is an antiderivative of f. We will now see that, in general, the indefinite integral 
need not be an antiderivative (either because the derivative of the indefinite integral does 
not exist or does not equal f(x)). 


7.3.7 Examples (a) If $f(x) := \operatorname{sgn} x$ on $[-1, 1]$, then $f \in \mathcal{R}[-1, 1]$ and has the indefinite integral $F(x) := |x| - 1$ with the basepoint $-1$. However, since $F'(0)$ does not exist, $F$ is not an antiderivative of $f$ on $[-1, 1]$.

(b) If $h$ denotes Thomae’s function, considered in 7.1.7, then its indefinite integral $H(x) := \int_0^x h$ is identically 0 on $[0, 1]$. Here, the derivative of this indefinite integral exists at every point and $H'(x) = 0$. But $H'(x) \ne h(x)$ whenever $x \in \mathbb{Q} \cap (0, 1]$, so that $H$ is not an antiderivative of $h$ on $[0, 1]$.
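The failure of differentiability in the signum example can be seen by approximating the indefinite integral with basepoint $-1$ and comparing one-sided difference quotients at $0$. This is a rough numerical sketch under stated assumptions: the midpoint-rule helper `indefinite_integral`, the step count, and the increment `h` are illustrative choices, not part of the text.

```python
def sgn(x):
    """Signum function: -1, 0, or 1."""
    return (x > 0) - (x < 0)


def indefinite_integral(f, base, z, n=20000):
    """Midpoint-rule approximation of the integral of f from base to z."""
    width = (z - base) / n
    return width * sum(f(base + (i + 0.5) * width) for i in range(n))


def F(z):
    """Approximate indefinite integral of sgn with basepoint -1."""
    return indefinite_integral(sgn, -1.0, z)


h = 0.01
right_quotient = (F(h) - F(0.0)) / h       # close to +1 = sgn(0+)
left_quotient = (F(-h) - F(0.0)) / (-h)    # close to -1 = sgn(0-)
print(right_quotient, left_quotient, F(0.5), abs(0.5) - 1)
```

The two one-sided quotients approach different values, which matches the closed form $F(x) = |x| - 1$ having a corner at $0$.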
Substitution Theorem 


The next theorem provides the justification for the “change of variable” method that is 
often used to evaluate integrals. This theorem is employed (usually implicitly) in the 
evaluation by means of procedures that involve the manipulation of “differentials,” 
common in elementary courses. 


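7.3.8 Substitution Theorem Let $J := [\alpha, \beta]$ and let $\varphi : J \to \mathbb{R}$ have a continuous derivative on $J$. If $f : I \to \mathbb{R}$ is continuous on an interval $I$ containing $\varphi(J)$, then

$$\int_\alpha^\beta f(\varphi(t)) \cdot \varphi'(t)\,dt = \int_{\varphi(\alpha)}^{\varphi(\beta)} f(x)\,dx. \tag{5}$$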


The proof of this theorem is based on the Chain Rule 6.1.6, and will be outlined in 
Exercise 17. The hypotheses that $f$ and $\varphi'$ are continuous are restrictive, but are used to 
ensure the existence of the Riemann integral on the left side of (5). 


7.3.9 Examples (a) Consider the integral

$$\int_1^4 \frac{\sin\sqrt{t}}{\sqrt{t}}\,dt.$$

Here we substitute $\varphi(t) := \sqrt{t}$ for $t \in [1,4]$ so that $\varphi'(t) = 1/(2\sqrt{t})$ is continuous on $[1, 4]$. If we let $f(x) := 2\sin x$, then the integrand has the form $(f \circ \varphi) \cdot \varphi'$ and the Substitution Theorem 7.3.8 implies that the integral equals $\int_1^2 2\sin x\,dx = -2\cos x\big|_1^2 = 2(\cos 1 - \cos 2)$.

(b) Consider the integral

$$\int_0^4 \frac{\sin\sqrt{t}}{\sqrt{t}}\,dt.$$

Since $\varphi(t) := \sqrt{t}$ does not have a continuous derivative on $[0, 4]$, the Substitution Theorem 7.3.8 is not applicable, at least with this substitution. (In fact, it is not obvious that this integral exists; however, we can apply Exercise 7.2.11 to obtain this conclusion. We could then apply the Fundamental Theorem 7.3.1 to $F(t) := -2\cos\sqrt{t}$ with $E := \{0\}$ to evaluate this integral.)


We will give a more powerful Substitution Theorem for the generalized Riemann 
integral in Section 10.1. 
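The substitution $\varphi(t) = \sqrt{t}$ applied above to the integral of $\sin\sqrt{t}/\sqrt{t}$ over $[1, 4]$ can be sanity-checked numerically. The following sketch uses a simple midpoint rule; the helper name `midpoint_rule` and the step count are assumptions made for illustration.

```python
import math


def midpoint_rule(f, a, b, n=20000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    width = (b - a) / n
    return width * sum(f(a + (i + 0.5) * width) for i in range(n))


# Left side of (5): integrand (f o phi) * phi' with phi(t) = sqrt(t), f(x) = 2 sin x.
left = midpoint_rule(lambda t: math.sin(math.sqrt(t)) / math.sqrt(t), 1.0, 4.0)

# Right side of (5): integral of f over [phi(1), phi(4)] = [1, 2].
right = midpoint_rule(lambda x: 2.0 * math.sin(x), 1.0, 2.0)

closed_form = 2.0 * (math.cos(1.0) - math.cos(2.0))
print(left, right, closed_form)
```

All three numbers agree to within the quadrature error, which is consistent with equation (5) for this choice of $f$ and $\varphi$.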


Lebesgue’s Integrability Criterion 


We will now present a statement of the definitive theorem due to Henri Lebesgue 
(1875-1941) giving a necessary and sufficient condition for a function to be Riemann 
integrable, and will give some applications of this theorem. In order to state this result, 
we need to introduce the important notion of a null set. 


Warning Some people use the term “null set” as a synonym for the terms “empty set” or “void set” referring to $\emptyset$ (= the set that has no elements). However, we will always use the term “null set” in conformity with our next definition, as is customary in the theory of integration.


7.3.10 Definition (a) A set $Z \subset \mathbb{R}$ is said to be a null set if for every $\varepsilon > 0$ there exists a countable collection $\{(a_k, b_k)\}_{k=1}^{\infty}$ of open intervals such that

$$Z \subseteq \bigcup_{k=1}^{\infty} (a_k, b_k) \quad\text{and}\quad \sum_{k=1}^{\infty} (b_k - a_k) \le \varepsilon. \tag{6}$$

(b) If $Q(x)$ is a statement about the point $x \in I$, we say that $Q(x)$ holds almost everywhere on $I$ (or for almost every $x \in I$), if there exists a null set $Z \subset I$ such that $Q(x)$ holds for all $x \in I \setminus Z$. In this case we may write

$$Q(x) \quad\text{for a.e. } x \in I.$$


It is trivial that any subset of a null set is also a null set, and it is easy to see that the 
union of two null sets is a null set. We will now give an example that may be very 
surprising. 


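7.3.11 Example The set $\mathbb{Q}_1$ of rational numbers in $[0, 1]$ is a null set.

We enumerate $\mathbb{Q}_1 = \{r_1, r_2, \ldots\}$. Given $\varepsilon > 0$, note that the open interval $J_1 := (r_1 - \varepsilon/4, r_1 + \varepsilon/4)$ contains $r_1$ and has length $\varepsilon/2$; also the open interval $J_2 := (r_2 - \varepsilon/8, r_2 + \varepsilon/8)$ contains $r_2$ and has length $\varepsilon/4$. In general, the open interval

$$J_k := \left( r_k - \frac{\varepsilon}{2^{k+1}},\; r_k + \frac{\varepsilon}{2^{k+1}} \right)$$

contains the point $r_k$ and has length $\varepsilon/2^k$. Therefore, the union $\bigcup_{k=1}^{\infty} J_k$ of these open intervals contains every point of $\mathbb{Q}_1$; moreover, the sum of the lengths is $\sum_{k=1}^{\infty} (\varepsilon/2^k) = \varepsilon$. Since $\varepsilon > 0$ is arbitrary, $\mathbb{Q}_1$ is a null set.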
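The covering in Example 7.3.11 can be built explicitly for the first few rationals. The sketch below works in exact rational arithmetic; the particular enumeration order ($0, 1, 1/2, 1/3, 2/3, \ldots$) and the names `rationals_in_unit_interval` and `cover` are assumptions chosen for illustration, since the text only requires that some enumeration exists.

```python
from fractions import Fraction
from math import gcd


def rationals_in_unit_interval():
    """Yield every rational in [0, 1] exactly once: 0, 1, 1/2, 1/3, 2/3, ..."""
    yield Fraction(0)
    yield Fraction(1)
    q = 2
    while True:
        for p in range(1, q):
            if gcd(p, q) == 1:  # lowest terms only, so no repeats
                yield Fraction(p, q)
        q += 1


def cover(eps, count):
    """Return (r_k, left, right) for J_k = (r_k - eps/2^(k+1), r_k + eps/2^(k+1)),
    k = 1, ..., count."""
    intervals = []
    for k, r in zip(range(1, count + 1), rationals_in_unit_interval()):
        half_width = eps / 2 ** (k + 1)
        intervals.append((r, r - half_width, r + half_width))
    return intervals


eps = Fraction(1, 100)
count = 50
intervals = cover(eps, count)
total_length = sum(right - left for _, left, right in intervals)
print(float(total_length), total_length < eps)
```

The partial sums of the lengths are $\varepsilon(1 - 2^{-N})$, so they stay below $\varepsilon$ no matter how many rationals are covered.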


The argument just given can be modified to show that: Every countable set is a null set. 
However, it can be shown that there exist uncountable null sets in R; for example, the 
Cantor set that will be introduced in Definition 11.1.10. 


We now state Lebesgue’s Integrability Criterion. It asserts that a bounded function on 
an interval is Riemann integrable if and only if its points of discontinuity form a null set. 


7.3.12 Lebesgue’s Integrability Criterion A bounded function $f : [a,b] \to \mathbb{R}$ is Riemann integrable if and only if it is continuous almost everywhere on $[a, b]$.


A proof of this result will be given in Appendix C. However, we will apply Lebesgue’s 
Theorem here to some specific functions, and show that some of our previous results follow 
immediately from it. We shall also use this theorem to obtain the important Composition 
and Product Theorems. 


7.3.13 Examples (a) The step function g in Example 7.1.4(b) is continuous at every 
point except the point x = 1. Therefore it follows from the Lebesgue Integrability Criterion 
that g is Riemann integrable. 
In fact, since every step function has at most a finite set of points of discontinuity, then: 
Every step function on [a, b] is Riemann integrable. 
(b) Since it was seen in Theorem 5.6.4 that the set of points of discontinuity of a monotone 
function is countable, we conclude that: Every monotone function on [a, b] is Riemann 
integrable. 
(c) The function G in Example 7.1.4(d) is discontinuous precisely at the points D := 
{1, 1/2,..., 1/n,...}. Since this is a countable set, it is a null set and Lebesgue’s Criterion 
implies that G is Riemann integrable. 
(d) The Dirichlet function was shown in Example 7.2.2(b) not to be Riemann integrable. 
Note that it is discontinuous at every point of [0, 1]. Since it can be shown that the 
interval [0, 1] is not a null set, Lebesgue’s Criterion yields the same conclusion. 
(e) Let $h : [0, 1] \to \mathbb{R}$ be Thomae’s function, defined in Examples 5.1.6(h) and 7.1.7.

In Example 5.1.6(h), we saw that $h$ is continuous at every irrational number and is discontinuous at every rational number in $[0, 1]$. By Example 7.3.11, it is discontinuous on a null set, so Lebesgue’s Criterion implies that Thomae’s function is Riemann integrable on $[0, 1]$, as we saw in Example 7.1.7.


We now obtain a result that will enable us to take other combinations of Riemann 
integrable functions. 


7.3.14 Composition Theorem Let $f \in \mathcal{R}[a, b]$ with $f([a, b]) \subseteq [c, d]$ and let $\varphi : [c,d] \to \mathbb{R}$ be continuous. Then the composition $\varphi \circ f$ belongs to $\mathcal{R}[a, b]$.


Proof. If $f$ is continuous at a point $u \in [a, b]$, then $\varphi \circ f$ is also continuous at $u$. Since the set $D$ of points of discontinuity of $f$ is a null set, it follows that the set $D_1 \subseteq D$ of points of discontinuity of $\varphi \circ f$ is also a null set. Therefore the composition $\varphi \circ f$ also belongs to $\mathcal{R}[a, b]$. Q.E.D.


It will be seen in Exercise 22 that the hypothesis that $\varphi$ is continuous cannot be dropped. The next result is a corollary of the Composition Theorem.


7.3.15 Corollary Suppose that $f \in \mathcal{R}[a, b]$. Then its absolute value $|f|$ is in $\mathcal{R}[a, b]$, and

$$\left| \int_a^b f \right| \le \int_a^b |f| \le M(b - a),$$

where $|f(x)| \le M$ for all $x \in [a, b]$.

Proof. We have seen in Theorem 7.1.6 that if $f$ is integrable, then there exists $M$ such that $|f(x)| \le M$ for all $x \in [a,b]$. Let $\varphi(t) := |t|$ for $t \in [-M, M]$; then the Composition Theorem implies that $|f| = \varphi \circ f \in \mathcal{R}[a, b]$. The first inequality follows from the fact that $-|f| \le f \le |f|$ and 7.1.5(c), and the second from the fact that $|f(x)| \le M$. Q.E.D.


7.3.16 The Product Theorem If $f$ and $g$ belong to $\mathcal{R}[a, b]$, then the product $fg$ belongs to $\mathcal{R}[a, b]$.


Proof. If $\varphi(t) := t^2$ for $t \in [-M, M]$, it follows from the Composition Theorem that $f^2 = \varphi \circ f$ belongs to $\mathcal{R}[a, b]$. Similarly, $(f + g)^2$ and $g^2$ belong to $\mathcal{R}[a, b]$. But since we can write the product as

$$fg = \tfrac{1}{2}\bigl[(f + g)^2 - f^2 - g^2\bigr],$$

it follows that $fg \in \mathcal{R}[a, b]$. Q.E.D.


Integration by Parts 


We will conclude this section with a rather general form of Integration by Parts for the 
Riemann integral, and Taylor’s Theorem with the Remainder. 


7.3.17 Integration by Parts Let $F, G$ be differentiable on $[a, b]$ and let $f := F'$ and $g := G'$ belong to $\mathcal{R}[a, b]$. Then

$$\int_a^b fG = FG\Big|_a^b - \int_a^b Fg. \tag{7}$$

Proof. By Theorem 6.1.3(c), the derivative $(FG)'$ exists on $[a, b]$ and

$$(FG)' = F'G + FG' = fG + Fg.$$

Since $F, G$ are continuous and $f, g$ belong to $\mathcal{R}[a, b]$, the Product Theorem 7.3.16 implies that $fG$ and $Fg$ are integrable. Therefore the Fundamental Theorem 7.3.1 implies that

$$FG\Big|_a^b = \int_a^b (FG)' = \int_a^b fG + \int_a^b Fg,$$

from which (7) follows. Q.E.D.


A special, but useful, case of this theorem is when f and g are continuous on [a, b] and 
$F, G$ are their indefinite integrals $F(x) := \int_a^x f$ and $G(x) := \int_a^x g$. 
We close this section with a version of Taylor’s Theorem for the Riemann Integral. 


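7.3.18 Taylor’s Theorem with the Remainder Suppose that $f', \ldots, f^{(n)}, f^{(n+1)}$ exist on $[a, b]$ and that $f^{(n+1)} \in \mathcal{R}[a, b]$. Then we have

$$f(b) = f(a) + \frac{f'(a)}{1!}(b - a) + \cdots + \frac{f^{(n)}(a)}{n!}(b - a)^n + R_n, \tag{8}$$

where the remainder is given by

$$R_n = \frac{1}{n!} \int_a^b f^{(n+1)}(t) \cdot (b - t)^n\,dt. \tag{9}$$

Proof. Apply Integration by Parts to equation (9), with $F(t) := f^{(n)}(t)$ and $G(t) := (b - t)^n/n!$, so that $g(t) = -(b - t)^{n-1}/(n-1)!$, to get

$$R_n = \frac{1}{n!} f^{(n)}(t) \cdot (b - t)^n \Big|_{t=a}^{t=b} + \frac{1}{(n-1)!} \int_a^b f^{(n)}(t) \cdot (b - t)^{n-1}\,dt$$

$$\phantom{R_n} = -\frac{f^{(n)}(a)}{n!} \cdot (b - a)^n + \frac{1}{(n-1)!} \int_a^b f^{(n)}(t) \cdot (b - t)^{n-1}\,dt.$$

If we continue to integrate by parts in this way, we obtain (8). Q.E.D.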
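Formulas (8) and (9) can be checked numerically for a function whose derivatives are all known. The sketch below assumes the illustrative choice $f = \exp$ on $[0, 1]$ with $n = 3$; the helper names and the midpoint rule used for the integral are assumptions, not part of the text.

```python
import math


def midpoint_rule(f, a, b, n=20000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    width = (b - a) / n
    return width * sum(f(a + (i + 0.5) * width) for i in range(n))


def taylor_polynomial_exp(a, b, n):
    """Sum of f^(k)(a) (b - a)^k / k! for k = 0..n, where f = exp
    (so every derivative of f is again exp)."""
    return sum(math.exp(a) * (b - a) ** k / math.factorial(k) for k in range(n + 1))


def remainder_exp(a, b, n):
    """R_n = (1/n!) * integral over [a, b] of f^(n+1)(t) (b - t)^n dt for f = exp."""
    return midpoint_rule(lambda t: math.exp(t) * (b - t) ** n, a, b) / math.factorial(n)


a, b, n = 0.0, 1.0, 3
polynomial = taylor_polynomial_exp(a, b, n)
remainder = remainder_exp(a, b, n)
print(polynomial + remainder, math.exp(b))
```

Here the polynomial part is $1 + 1 + 1/2 + 1/6$ and the integral remainder accounts for the rest of $e$.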


Exercises for Section 7.3 


1. Extend the proof of the Fundamental Theorem 7.3.1 to the case of an arbitrary finite set E. 


2. If $n \in \mathbb{N}$ and $H_n(x) := x^{n+1}/(n + 1)$ for $x \in [a,b]$, show that the Fundamental Theorem 7.3.1 implies that $\int_a^b x^n\,dx = (b^{n+1} - a^{n+1})/(n + 1)$. What is the set $E$ here?

3. If $g(x) := x$ for $|x| \ge 1$ and $g(x) := -x$ for $|x| < 1$ and if $G(x) := \tfrac{1}{2}|x^2 - 1|$, show that $\int_{-2}^{3} g(x)\,dx = G(3) - G(-2) = 5/2$.

4. Let $B(x) := -\tfrac{1}{2}x^2$ for $x < 0$ and $B(x) := \tfrac{1}{2}x^2$ for $x \ge 0$. Show that $\int_a^b |x|\,dx = B(b) - B(a)$.

5. Let $f : [a,b] \to \mathbb{R}$ and let $C \in \mathbb{R}$.
(a) If $\Phi : [a,b] \to \mathbb{R}$ is an antiderivative of $f$ on $[a, b]$, show that $\Phi_C(x) := \Phi(x) + C$ is also an antiderivative of $f$ on $[a, b]$.
(b) If $\Phi_1$ and $\Phi_2$ are antiderivatives of $f$ on $[a, b]$, show that $\Phi_1 - \Phi_2$ is a constant function on $[a, b]$.

6. If $f \in \mathcal{R}[a,b]$ and if $c \in [a,b]$, the function defined by $F_c(z) := \int_c^z f$ for $z \in [a,b]$ is called the indefinite integral of $f$ with basepoint $c$. Find a relation between $F_a$ and $F_c$.

7. We have seen in Example 7.1.7 that Thomae’s function is in R[0, 1] with integral equal to 0. Can 
the Fundamental Theorem 7.3.1 be used to obtain this conclusion? Explain your answer. 


8. Let $F(x)$ be defined for $x \ge 0$ by $F(x) := (n - 1)x - (n - 1)n/2$ for $x \in [n - 1, n)$, $n \in \mathbb{N}$. Show that $F$ is continuous and evaluate $F'(x)$ at points where this derivative exists. Use this result to evaluate $\int_a^b [\![x]\!]\,dx$ for $0 \le a < b$, where $[\![x]\!]$ denotes the greatest integer in $x$, as defined in Exercise 5.1.4.


9. Let $f \in \mathcal{R}[a, b]$ and define $F(x) := \int_a^x f$ for $x \in [a,b]$.
(a) Evaluate $G(x) := \int_c^x f$ in terms of $F$, where $c \in [a,b]$.
(b) Evaluate $H(x) := \int_x^b f$ in terms of $F$.
(c) Evaluate $S(x) := \int_x^{\sin x} f$ in terms of $F$.

10. Let $f : [a,b] \to \mathbb{R}$ be continuous on $[a, b]$ and let $v : [c,d] \to \mathbb{R}$ be differentiable on $[c, d]$ with $v([c,d]) \subseteq [a,b]$. If we define $G(x) := \int_a^{v(x)} f$, show that $G'(x) = f(v(x)) \cdot v'(x)$ for all $x \in [c,d]$.

11. Find $F'(x)$ when $F$ is defined on $[0, 1]$ by:
(a) $F(x) := \int_0^{x^2} (1 + t^3)^{-1}\,dt$. (b) $F(x) := \int_{x^2}^{x} \sqrt{1 + t^2}\,dt$.

12. Let $f : [0, 3] \to \mathbb{R}$ be defined by $f(x) := x$ for $0 \le x < 1$, $f(x) := 1$ for $1 \le x < 2$ and $f(x) := x$ for $2 \le x \le 3$. Obtain formulas for $F(x) := \int_0^x f$ and sketch the graphs of $f$ and $F$. Where is $F$ differentiable? Evaluate $F'(x)$ at all such points.

13. The function $g$ is defined on $[0, 3]$ by $g(x) := -1$ if $0 \le x < 2$ and $g(x) := 1$ if $2 \le x \le 3$. Find the indefinite integral $G(x) := \int_0^x g$ for $0 \le x \le 3$, and sketch the graphs of $g$ and $G$. Does $G'(x) = g(x)$ for all $x$ in $[0, 3]$?

14. Show there does not exist a continuously differentiable function f on [0, 2] such that 
$f(0) = -1$, $f(2) = 4$, and $f'(x) \le 2$ for $0 \le x \le 2$. (Apply the Fundamental Theorem.) 


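15. If $f : \mathbb{R} \to \mathbb{R}$ is continuous and $c > 0$, define $g : \mathbb{R} \to \mathbb{R}$ by $g(x) := \int_{x-c}^{x+c} f(t)\,dt$. Show that $g$ is differentiable on $\mathbb{R}$ and find $g'(x)$.

16. If $f : [0, 1] \to \mathbb{R}$ is continuous and $\int_0^x f = \int_x^1 f$ for all $x \in [0, 1]$, show that $f(x) = 0$ for all $x \in [0, 1]$.

17. Use the following argument to prove the Substitution Theorem 7.3.8. Define $F(u) := \int_{\varphi(\alpha)}^{u} f(x)\,dx$ for $u \in I$, and $H(t) := F(\varphi(t))$ for $t \in J$. Show that $H'(t) = f(\varphi(t))\varphi'(t)$ for $t \in J$ and that

$$\int_{\varphi(\alpha)}^{\varphi(\beta)} f(x)\,dx = F(\varphi(\beta)) = H(\beta) = \int_\alpha^\beta f(\varphi(t))\varphi'(t)\,dt.$$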

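18. Use the Substitution Theorem 7.3.8 to evaluate the following integrals.

(a) $\int_0^1 t\sqrt{1 + t^2}\,dt$, (b) $\int_0^2 t^2(1 + t^3)^{-1/2}\,dt = 4/3$,
(c) $\int_1^4 \dfrac{\sqrt{1 + \sqrt{t}}}{\sqrt{t}}\,dt$, (d) $\int_1^4 \dfrac{\cos\sqrt{t}}{\sqrt{t}}\,dt = 2(\sin 2 - \sin 1)$.

19. Explain why Theorem 7.3.8 and/or Exercise 7.3.17 cannot be applied to evaluate the following integrals, using the indicated substitution.

(a) $\int_0^4 \dfrac{\sqrt{t}\,dt}{1 + \sqrt{t}}$, $\varphi(t) = \sqrt{t}$, (b) $\int_0^4 \dfrac{\cos\sqrt{t}\,dt}{\sqrt{t}}$, $\varphi(t) = \sqrt{t}$,
(c) $\int_{-1}^{1} \sqrt{1 + 2|t|}\,dt$, $\varphi(t) = |t|$, (d) $\int_0^1 \dfrac{dt}{\sqrt{1 - t^2}}$, $\varphi(t) = \operatorname{Arcsin} t$.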
20. (a) If $Z_1$ and $Z_2$ are null sets, show that $Z_1 \cup Z_2$ is a null set.

(b) More generally, if $Z_n$ is a null set for each $n \in \mathbb{N}$, show that $\bigcup_{n=1}^{\infty} Z_n$ is a null set. [Hint: Given $\varepsilon > 0$ and $n \in \mathbb{N}$, let $\{J_k^n : k \in \mathbb{N}\}$ be a countable collection of open intervals whose union contains $Z_n$ and the sum of whose lengths is $\le \varepsilon/2^n$. Now consider the countable collection $\{J_k^n : n, k \in \mathbb{N}\}$.]

21. Let $f, g \in \mathcal{R}[a,b]$.
(a) If $t \in \mathbb{R}$, show that $\int_a^b (tf \pm g)^2 \ge 0$.
(b) Use (a) to show that $2\left|\int_a^b fg\right| \le t\int_a^b f^2 + (1/t)\int_a^b g^2$ for $t > 0$.
(c) If $\int_a^b f^2 = 0$, show that $\int_a^b fg = 0$.
(d) Now prove that $\left|\int_a^b fg\right|^2 \le \left(\int_a^b |fg|\right)^2 \le \left(\int_a^b f^2\right) \cdot \left(\int_a^b g^2\right)$. This inequality is called the Cauchy-Bunyakovsky-Schwarz Inequality (or simply the Schwarz Inequality).


22. Let $h : [0, 1] \to \mathbb{R}$ be Thomae’s function and let sgn be the signum function. Show that the composite function $\operatorname{sgn} \circ h$ is not Riemann integrable on $[0, 1]$.


Section 7.4 The Darboux Integral 


An alternative approach to the integral is due to the French mathematician Gaston Darboux 
(1842-1917). Darboux had translated Riemann’s work on integration into French for 
publication in a French journal and inspired by a remark of Riemann, he developed a 
treatment of the integral in terms of upper and lower integrals that was published in 1875. 
Approximating sums in this approach are obtained from partitions using the infima and 
suprema of function values on subintervals, which need not be attained as function values 
and thus the sums need not be Riemann sums. 

This approach is technically simpler in the sense that it avoids the complications of 
working with infinitely many possible choices of tags. But working with infima and 
suprema also has its complications, such as lack of additivity of these quantities. Moreover, 
the reliance on the order properties of the real numbers causes difficulties in extending the 
Darboux integral to higher dimensions, and, more importantly, impedes generalization to 
more abstract surfaces such as manifolds. Also, the powerful Henstock-Kurzweil approach 
to integration presented in Chapter 10, which includes the Lebesgue integral, is based on 
the Riemann definition as given in Section 7.1. 

In this section we introduce the upper and lower integrals of a bounded function on an 
interval, and define a function to be Darboux integrable if these two quantities are equal. 
We then look at examples and establish a Cauchy-like integrability criterion for the 
Darboux integral. We conclude the section by proving that the Riemann and Darboux 
approaches to the integral are in fact equivalent, that is, a function on a closed, bounded 
interval is Riemann integrable if and only if it is Darboux integrable. Later topics in the 
book do not depend on the Darboux definition of integral so that this section can be 
regarded as optional. 


Upper and Lower Sums 


Let $f : I \to \mathbb{R}$ be a bounded function on $I = [a, b]$ and let $\mathcal{P} = (x_0, x_1, \ldots, x_n)$ be a partition of $I$. For $k = 1, 2, \ldots, n$ we let

$$m_k := \inf\{f(x) : x \in [x_{k-1}, x_k]\}, \qquad M_k := \sup\{f(x) : x \in [x_{k-1}, x_k]\}.$$

Figure 7.4.2 $U(f; \mathcal{P})$ an upper sum


The lower sum of $f$ corresponding to the partition $P$ is defined to be

$$L(f;P) := \sum_{k=1}^{n} m_k (x_k - x_{k-1}),$$

and the upper sum of $f$ corresponding to $P$ is defined to be

$$U(f;P) := \sum_{k=1}^{n} M_k (x_k - x_{k-1}).$$


If $f$ is a positive function, then the lower sum $L(f;P)$ can be interpreted as the area of the
union of rectangles with base $[x_{k-1}, x_k]$ and height $m_k$. (See Figure 7.4.1.) Similarly, the upper
sum $U(f;P)$ can be interpreted as the area of the union of rectangles with base $[x_{k-1}, x_k]$ and
height $M_k$. (See Figure 7.4.2.) The geometric interpretation suggests that, for a given partition,
the lower sum is less than or equal to the upper sum. We now show this to be the case.
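The definitions of $L(f;P)$ and $U(f;P)$ can be tried out numerically. The following Python sketch is an illustration only and is not part of the text. It assumes that $f$ is monotone on each subinterval, so that $m_k$ and $M_k$ are the two endpoint values; for a general bounded function the infimum and supremum cannot be read off from finitely many samples. The function name `darboux_sums_monotone` is hypothetical.

```python
from fractions import Fraction

def darboux_sums_monotone(f, partition):
    """Return (L(f;P), U(f;P)) for a function f that is monotone on each
    subinterval of the partition. Only under that assumption are the
    infimum and supremum on [x_{k-1}, x_k] the two endpoint values."""
    lower = upper = 0
    for left, right in zip(partition, partition[1:]):
        a, b = f(left), f(right)
        m_k, M_k = min(a, b), max(a, b)   # inf and sup on this subinterval
        width = right - left
        lower += m_k * width
        upper += M_k * width
    return lower, upper

# h(x) = x on [0, 1] with four equal subintervals, in exact arithmetic.
P4 = [Fraction(k, 4) for k in range(5)]
L, U = darboux_sums_monotone(lambda x: x, P4)
print(L, U)  # 3/8 5/8
```

Exact rational arithmetic is used so that the inequality $L(f;P) \le U(f;P)$ can be checked without rounding effects.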


7.4.1 Lemma If $f : I \to \mathbb{R}$ is bounded and $P$ is any partition of $I$, then
$L(f;P) \le U(f;P)$.


Proof. Let $P := (x_0, x_1, \ldots, x_n)$. Since $m_k \le M_k$ for $k = 1, 2, \ldots, n$ and since
$x_k - x_{k-1} > 0$ for $k = 1, 2, \ldots, n$, it follows that

$$L(f;P) = \sum_{k=1}^{n} m_k (x_k - x_{k-1}) \le \sum_{k=1}^{n} M_k (x_k - x_{k-1}) = U(f;P).$$ Q.E.D.

If $P := (x_0, x_1, \ldots, x_n)$ and $Q := (y_0, y_1, \ldots, y_m)$ are partitions of $I$, we say that $Q$ is a
refinement of $P$ if each partition point $x_k \in P$ also belongs to $Q$ (that is, if $P \subseteq Q$). A
refinement $Q$ of a partition $P$ can be obtained by adjoining a finite number of points to $P$. In
this case, each one of the intervals $[x_{k-1}, x_k]$ into which $P$ divides $I$ can be written as the
union of intervals whose end points belong to $Q$; that is,

$$[x_{k-1}, x_k] = [y_{j-1}, y_j] \cup [y_j, y_{j+1}] \cup \cdots \cup [y_{h-1}, y_h].$$


We now show that refining a partition increases lower sums and decreases upper sums.

7.4.2 Lemma If $f : I \to \mathbb{R}$ is bounded, if $P$ is a partition of $I$, and if $Q$ is a refinement of
$P$, then

$$L(f;P) \le L(f;Q) \quad\text{and}\quad U(f;Q) \le U(f;P).$$


Proof. Let $P = (x_0, x_1, \ldots, x_n)$. We first examine the effect of adjoining one point to $P$.
Let $z \in I$ satisfy $x_{k-1} < z < x_k$ and let $P'$ be the partition

$$P' := (x_0, x_1, \ldots, x_{k-1}, z, x_k, \ldots, x_n),$$

obtained from $P$ by adjoining $z$ to $P$. Let $m_k'$ and $m_k''$ be the numbers

$$m_k' := \inf\{ f(x) : x \in [x_{k-1}, z] \}, \qquad m_k'' := \inf\{ f(x) : x \in [z, x_k] \}.$$

Then $m_k \le m_k'$ and $m_k \le m_k''$ (why?) and therefore

$$m_k (x_k - x_{k-1}) = m_k (z - x_{k-1}) + m_k (x_k - z) \le m_k' (z - x_{k-1}) + m_k'' (x_k - z).$$

If we add the terms $m_j (x_j - x_{j-1})$ for $j \ne k$ to the above inequality, we obtain

$$L(f;P) \le L(f;P').$$

Now if $Q$ is any refinement of $P$ (i.e., if $P \subseteq Q$), then $Q$ can be obtained from $P$ by
adjoining a finite number of points to $P$ one at a time. Hence, repeating the preceding
argument, we infer that $L(f;P) \le L(f;Q)$.

Upper sums are handled similarly; we leave the details as an exercise. Q.E.D.


These two results are now combined to conclude that a lower sum never exceeds
an upper sum, even if they correspond to different partitions.


7.4.3 Lemma Let $f : I \to \mathbb{R}$ be bounded. If $P_1$, $P_2$ are any two partitions of $I$, then
$L(f;P_1) \le U(f;P_2)$.


Proof. Let $Q := P_1 \cup P_2$ be the partition obtained by combining the points of $P_1$ and $P_2$.
Then $Q$ is a refinement of both $P_1$ and $P_2$. Hence, by Lemmas 7.4.1 and 7.4.2, we conclude
that

$$L(f;P_1) \le L(f;Q) \le U(f;Q) \le U(f;P_2).$$ Q.E.D.
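The chain of inequalities in Lemmas 7.4.1 to 7.4.3 can be checked on a small example. The sketch below is illustrative and not from the text: it assumes an increasing function (here $x \mapsto x^2$ on $[0,1]$, chosen for the illustration), so that the infimum and supremum on each subinterval sit at the left and right endpoints. The helper names `sums` and `common_refinement` are hypothetical.

```python
from fractions import Fraction

def sums(f, P):
    """(L(f;P), U(f;P)) for an increasing f: inf at the left end, sup at the right end."""
    L = sum(f(x0) * (x1 - x0) for x0, x1 in zip(P, P[1:]))
    U = sum(f(x1) * (x1 - x0) for x0, x1 in zip(P, P[1:]))
    return L, U

def common_refinement(P1, P2):
    """Q := P1 union P2, sorted, with duplicate points merged."""
    return sorted(set(P1) | set(P2))

f = lambda x: x * x                      # increasing on [0, 1]
P1 = [Fraction(0), Fraction(1, 2), Fraction(1)]
P2 = [Fraction(0), Fraction(1, 3), Fraction(2, 3), Fraction(1)]
Q = common_refinement(P1, P2)

L1, U1 = sums(f, P1)
L2, U2 = sums(f, P2)
LQ, UQ = sums(f, Q)

# Lemma 7.4.2 applied to each partition, then the conclusion of Lemma 7.4.3.
assert L1 <= LQ <= UQ <= U1
assert L2 <= LQ <= UQ <= U2
assert L1 <= U2 and L2 <= U1
```

Neither $P_1$ nor $P_2$ refines the other here, which is exactly the situation Lemma 7.4.3 is designed to handle.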


Upper and Lower Integrals 


We shall denote the collection of all partitions of the interval $I$ by $\mathscr{P}(I)$. If $f : I \to \mathbb{R}$ is
bounded, then each $P$ in $\mathscr{P}(I)$ determines two numbers: $L(f;P)$ and $U(f;P)$. Thus, the
collection $\mathscr{P}(I)$ determines two sets of numbers: the set of lower sums $L(f;P)$ for
$P \in \mathscr{P}(I)$, and the set of upper sums $U(f;P)$ for $P \in \mathscr{P}(I)$. Hence, we are led to the
following definitions.


7.4.4 Definition Let $I := [a, b]$ and let $f : I \to \mathbb{R}$ be a bounded function. The lower
integral of $f$ on $I$ is the number

$$L(f) := \sup\{ L(f;P) : P \in \mathscr{P}(I) \},$$

and the upper integral of $f$ on $I$ is the number

$$U(f) := \inf\{ U(f;P) : P \in \mathscr{P}(I) \}.$$

Since $f$ is a bounded function, we are assured of the existence of the numbers

$$m_I := \inf\{ f(x) : x \in I \} \quad\text{and}\quad M_I := \sup\{ f(x) : x \in I \}.$$

It is readily seen that for any $P \in \mathscr{P}(I)$, we have

$$m_I (b - a) \le L(f;P) \le U(f;P) \le M_I (b - a).$$

Hence it follows that

$$m_I (b - a) \le L(f) \quad\text{and}\quad U(f) \le M_I (b - a). \tag{1}$$


The next inequality is also anticipated.

7.4.5 Theorem Let $I = [a, b]$ and let $f : I \to \mathbb{R}$ be a bounded function. Then the lower
integral $L(f)$ and the upper integral $U(f)$ of $f$ on $I$ exist. Moreover,

$$L(f) \le U(f). \tag{2}$$


Proof. If $P_1$ and $P_2$ are any partitions of $I$, then it follows from Lemma 7.4.3 that
$L(f;P_1) \le U(f;P_2)$. Therefore the number $U(f;P_2)$ is an upper bound for the set
$\{ L(f;P) : P \in \mathscr{P}(I) \}$. Consequently, $L(f)$, being the supremum of this set, satisfies
$L(f) \le U(f;P_2)$. Since $P_2$ is an arbitrary partition of $I$, then $L(f)$ is a lower bound for the
set $\{ U(f;P) : P \in \mathscr{P}(I) \}$. Consequently, the infimum $U(f)$ of this set satisfies the
inequality (2). Q.E.D.


The Darboux Integral 


If $I$ is a closed bounded interval and $f : I \to \mathbb{R}$ is a bounded function, we have proved in
Theorem 7.4.5 that the lower integral $L(f)$ and the upper integral $U(f)$ always exist.
Moreover, we always have $L(f) \le U(f)$. However, it is possible that we might have
$L(f) < U(f)$, as we will see in Example 7.4.7(d). On the other hand, there is a large class
of functions for which $L(f) = U(f)$.


7.4.6 Definition Let $I = [a, b]$ and let $f : I \to \mathbb{R}$ be a bounded function. Then $f$ is said
to be Darboux integrable on $I$ if $L(f) = U(f)$. In this case the Darboux integral of $f$ over
$I$ is defined to be the value $L(f) = U(f)$.


Thus we see that if the Darboux integral of a function on an interval exists, then the 
integral is the unique real number that lies between the lower sums and the upper sums. 

Since we will soon establish the equivalence of the Darboux and Riemann integrals,
we will use the standard notation $\int_a^b f$ or $\int_a^b f(x)\,dx$ for the Darboux integral of a function
$f$ on $[a, b]$. The context should prevent any confusion from arising.


7.4.7 Examples (a) A constant function is Darboux integrable.

Let $f(x) := c$ for $x \in I := [a, b]$. If $P$ is any partition of $I$, it is easy to see that
$L(f;P) = c(b - a) = U(f;P)$ (see Exercise 7.4.2). Therefore the lower and upper
integrals are given by $L(f) = c(b - a) = U(f)$. Consequently, $f$ is integrable on $I$ and
$\int_a^b f = \int_a^b c\,dx = c(b - a)$.

(b) Let $g$ be defined on $[0, 3]$ as follows: $g(x) := 2$ if $0 \le x \le 1$ and $g(x) := 3$ if
$1 < x \le 3$. (See Example 7.1.4(b).) For $\varepsilon > 0$, if we define the partition
$P_\varepsilon := (0, 1, 1 + \varepsilon, 3)$, then we get the upper sum

$$U(g;P_\varepsilon) = 2 \cdot (1 - 0) + 3(1 + \varepsilon - 1) + 3(2 - \varepsilon) = 2 + 3\varepsilon + 6 - 3\varepsilon = 8.$$

Therefore, the upper integral satisfies $U(g) \le 8$. (Note that we cannot yet claim
equality because $U(g)$ is the infimum over all partitions of $[0, 3]$.) Similarly, we get
the lower sum

$$L(g;P_\varepsilon) = 2 + 2\varepsilon + 3(2 - \varepsilon) = 8 - \varepsilon,$$

so that $L(g) \ge 8 - \varepsilon$; since $\varepsilon > 0$ is arbitrary, the lower integral satisfies $L(g) \ge 8$.
Then we have $8 \le L(g) \le U(g) \le 8$, and
hence $L(g) = U(g) = 8$. Thus the Darboux integral of $g$ is $\int_0^3 g = 8$.




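(c) The function $h(x) := x$ is integrable on $[0, 1]$.
Let $P_n$ be the partition of $I := [0, 1]$ into $n$ subintervals given by

$$P_n := \left(0, \frac{1}{n}, \frac{2}{n}, \ldots, \frac{n-1}{n}, \frac{n}{n} = 1\right).$$

Since $h$ is an increasing function, its infimum and supremum on the subinterval
$[(k - 1)/n, k/n]$ are attained at the left and right end points, respectively, and are thus
given by $m_k = (k - 1)/n$ and $M_k = k/n$. Moreover, since $x_k - x_{k-1} = 1/n$ for all
$k = 1, 2, \ldots, n$, we have

$$L(h;P_n) = (0 + 1 + \cdots + (n - 1))/n^2, \qquad U(h;P_n) = (1 + 2 + \cdots + n)/n^2.$$

If we use the formula $1 + 2 + \cdots + m = m(m + 1)/2$, for $m \in \mathbb{N}$, we obtain

$$L(h;P_n) = \frac{(n-1)n}{2n^2} = \frac{1}{2}\left(1 - \frac{1}{n}\right), \qquad U(h;P_n) = \frac{n(n+1)}{2n^2} = \frac{1}{2}\left(1 + \frac{1}{n}\right).$$

Since the set of partitions $\{P_n : n \in \mathbb{N}\}$ is a subset of the set $\mathscr{P}(I)$ of all partitions of $I$,
it follows that

$$\frac{1}{2} = \sup\{ L(h;P_n) : n \in \mathbb{N} \} \le \sup\{ L(h;P) : P \in \mathscr{P}(I) \} = L(h),$$

and also that

$$U(h) = \inf\{ U(h;P) : P \in \mathscr{P}(I) \} \le \inf\{ U(h;P_n) : n \in \mathbb{N} \} = \frac{1}{2}.$$

Since $\frac{1}{2} \le L(h) \le U(h) \le \frac{1}{2}$, we conclude that $L(h) = U(h) = \frac{1}{2}$. Therefore $h$ is
Darboux integrable on $I = [0, 1]$ and $\int_0^1 h = \int_0^1 x\,dx = \frac{1}{2}$.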
(d) A nonintegrable function.

Let $I := [0, 1]$ and let $f : I \to \mathbb{R}$ be the Dirichlet function defined by

$$f(x) := \begin{cases} 1 & \text{for } x \text{ rational}, \\ 0 & \text{for } x \text{ irrational}. \end{cases}$$

If $P := (x_0, x_1, \ldots, x_n)$ is any partition of $[0, 1]$, then since every nontrivial interval contains
both rational numbers and irrational numbers (see the Density Theorem 2.4.8 and its
corollary), we have $m_k = 0$ and $M_k = 1$. Therefore, we have $L(f;P) = 0$, $U(f;P) = 1$,
for all $P \in \mathscr{P}(I)$, so that $L(f) = 0$, $U(f) = 1$. Since $L(f) \ne U(f)$, the function $f$ is not
Darboux integrable on $[0, 1]$.


We now establish some conditions for the existence of the integral. 


7.4.8 Integrability Criterion Let $I := [a, b]$ and let $f : I \to \mathbb{R}$ be a bounded function
on $I$. Then $f$ is Darboux integrable on $I$ if and only if for each $\varepsilon > 0$ there is a partition $P_\varepsilon$ of
$I$ such that

$$U(f;P_\varepsilon) - L(f;P_\varepsilon) < \varepsilon. \tag{3}$$


Proof. If $f$ is integrable, then we have $L(f) = U(f)$. If $\varepsilon > 0$ is given, then from the
definition of the lower integral as a supremum, there is a partition $P_1$ of $I$ such that
$L(f) - \varepsilon/2 < L(f;P_1)$. Similarly, there is a partition $P_2$ of $I$ such that $U(f;P_2) < U(f) + \varepsilon/2$.
If we let $P_\varepsilon := P_1 \cup P_2$, then $P_\varepsilon$ is a refinement of both $P_1$ and $P_2$. Consequently, by
Lemmas 7.4.1 and 7.4.2, we have

$$L(f) - \varepsilon/2 < L(f;P_1) \le L(f;P_\varepsilon) \le U(f;P_\varepsilon) \le U(f;P_2) < U(f) + \varepsilon/2.$$

Since $L(f) = U(f)$, we conclude that (3) holds.
To establish the converse, we first observe that for any partition $P$ we have
$L(f;P) \le L(f)$ and $U(f) \le U(f;P)$. Therefore,

$$U(f) - L(f) \le U(f;P) - L(f;P).$$

Now suppose that for each $\varepsilon > 0$ there exists a partition $P_\varepsilon$ such that (3) holds.
Then we have

$$U(f) - L(f) \le U(f;P_\varepsilon) - L(f;P_\varepsilon) < \varepsilon.$$

Since $\varepsilon > 0$ is arbitrary, we conclude that $U(f) \le L(f)$. Since the inequality $L(f) \le U(f)$
is always valid, we have $L(f) = U(f)$. Hence $f$ is Darboux integrable. Q.E.D.


7.4.9 Corollary Let $I = [a, b]$ and let $f : I \to \mathbb{R}$ be a bounded function. If $\{P_n : n \in \mathbb{N}\}$
is a sequence of partitions of $I$ such that

$$\lim_n \bigl(U(f;P_n) - L(f;P_n)\bigr) = 0,$$

then $f$ is integrable and $\lim_n L(f;P_n) = \int_a^b f = \lim_n U(f;P_n)$.


Proof. If $\varepsilon > 0$ is given, it follows from the hypothesis that there exists $K$ such that if
$n \ge K$ then $U(f;P_n) - L(f;P_n) < \varepsilon$, whence the integrability of $f$ follows from the
Integrability Criterion. We leave the remainder of the proof as an exercise. Q.E.D.


The significance of the corollary is the fact that although the definition of the Darboux 
integral involves the set of all possible partitions of an interval, for a given function, the 
existence of the integral and its value can often be determined by a special sequence of 
partitions. 

For example, if $h(x) = x$ on $[0, 1]$ and $P_n$ is the partition as in Example 7.4.7(c), then

$$\lim\bigl(U(h;P_n) - L(h;P_n)\bigr) = \lim 1/n = 0$$

and therefore $\int_0^1 x\,dx = \lim U(h;P_n) = \lim \frac{1}{2}(1 + 1/n) = \frac{1}{2}$.
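The role of a special sequence of partitions can be seen concretely for $h(x) = x$. The following sketch is an illustration rather than part of the text: it computes $L(h;P_n)$ and $U(h;P_n)$ exactly, confirms that their difference is $1/n$, and finds an $n$ that meets the Integrability Criterion for a given $\varepsilon$. The function names are hypothetical.

```python
from fractions import Fraction

def sums_for_identity(n):
    """L(h;P_n) and U(h;P_n) for h(x) = x and the equal partition P_n of [0, 1]."""
    width = Fraction(1, n)
    lower = sum(Fraction(k - 1, n) * width for k in range(1, n + 1))  # m_k = (k-1)/n
    upper = sum(Fraction(k, n) * width for k in range(1, n + 1))      # M_k = k/n
    return lower, upper

def partition_size_for(eps):
    """Smallest n with U(h;P_n) - L(h;P_n) = 1/n < eps."""
    n = 1
    while Fraction(1, n) >= eps:
        n += 1
    return n

for n in (1, 10, 100):
    L, U = sums_for_identity(n)
    assert U - L == Fraction(1, n)
    assert L == Fraction(1, 2) * (1 - Fraction(1, n))
    assert U == Fraction(1, 2) * (1 + Fraction(1, n))
```

For instance, `partition_size_for(Fraction(1, 100))` returns the first $n$ for which the gap between upper and lower sums drops below $1/100$.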


Continuous and Monotone Functions 


It was shown in Section 7.2 that functions that are continuous or monotone on a closed 
bounded interval are Riemann integrable. (See Theorems 7.2.7 and 7.2.8.) The proofs 
employed approximation by step functions and the Squeeze Theorem 7.2.3 as the main 
tools. Both proofs made essential use of the fact that both continuous functions and 
monotone functions attain a maximum value and a minimum value on a closed bounded 
interval. That is, if $f$ is a continuous or monotone function on $[a, b]$, then for a partition
$P = (x_0, x_1, \ldots, x_n)$, the numbers $M_k = \sup\{ f(x) : x \in I_k \}$ and $m_k = \inf\{ f(x) : x \in I_k \}$,
$k = 1, 2, \ldots, n$, are attained as function values. For continuous functions, this is Theorem
5.3.4, and for monotone functions, these values are attained at the right and left endpoints
of the interval.

If we define the step function $\omega$ on $[a, b]$ by $\omega(x) := M_k$ for $x \in [x_{k-1}, x_k)$ for
$k = 1, 2, \ldots, n - 1$, and $\omega(x) := M_n$ for $x \in [x_{n-1}, x_n]$, then we observe that the Riemann
integral of $\omega$ is given by $\int_a^b \omega = \sum_{k=1}^{n} M_k (x_k - x_{k-1})$. (See Theorem 7.2.5.) Now we
recognize the sum on the right as the upper Darboux sum $U(f;P)$, so that we have

$$\int_a^b \omega = \sum_{k=1}^{n} M_k (x_k - x_{k-1}) = U(f;P).$$

Similarly, if the step function $\alpha$ is defined by $\alpha(x) := m_k$ for $x \in [x_{k-1}, x_k)$, $k = 1,
2, \ldots, n - 1$, and $\alpha(x) := m_n$ for $x \in [x_{n-1}, x_n]$, then we have the Riemann integral

$$\int_a^b \alpha = \sum_{k=1}^{n} m_k (x_k - x_{k-1}) = L(f;P).$$

Subtraction then gives us

$$\int_a^b (\omega - \alpha) = \sum_{k=1}^{n} (M_k - m_k)(x_k - x_{k-1}) = U(f;P) - L(f;P).$$

We thus see that the Integrability Criterion 7.4.8 is the Darboux integral counterpart to the
Squeeze Theorem 7.2.3 for the Riemann integral.

Therefore, if we examine the proofs of Theorems 7.2.7 and 7.2.8 that establish the 
Riemann integrability of continuous and monotone functions, respectively, and replace the 
integrals of step functions by the corresponding lower and upper sums, then we obtain 
proofs of the theorems for the Darboux integral. (For example, in Theorem 7.2.7 for
continuous functions, we would have $\alpha_\varepsilon(x) = f(u_i) = m_i$ and $\omega_\varepsilon(x) = f(v_i) = M_i$ and
replace the integral of $\omega_\varepsilon - \alpha_\varepsilon$ with $U(f;P) - L(f;P)$.)

Thus we have the following theorem. We leave it as an exercise for the reader to write 
out the proof. 


7.4.10 Theorem If the function $f$ on the interval $I = [a, b]$ is either continuous or
monotone on $I$, then $f$ is Darboux integrable on $I$.


The preceding observation that connects the Riemann and Darboux integrals plays a 
role in the proof of the equivalence of the two approaches to integration, which we now 
discuss. Of course, once equivalence has been established, then the preceding theorem 
would be an immediate consequence. 


Equivalence 


We conclude this section with a proof that the Riemann and Darboux definitions of the 
integral are equivalent in the sense that a function on a closed, bounded interval is Riemann 
integrable if and only if it is Darboux integrable, and their integrals are equal. This is not 
immediately apparent. The Riemann integral is defined in terms of sums that use function 
values (tags) together with a limiting process based on the length of subintervals in a 
partition. On the other hand, the Darboux integral is defined in terms of sums that use 
infima and suprema of function values, which need not be function values, and a limiting 
process based on refinement of partitions, not the size of subintervals in a partition. Yet the 
two are equivalent. 

The background needed to prove equivalence is at hand. For example, if a function is 
Darboux integrable, we recognize that upper and lower Darboux sums are Riemann 
integrals of step functions. Thus the Integrability Criterion 7.4.8 for the Darboux integral
corresponds to the Squeeze Theorem 7.2.3 for the Riemann integral in its application. In
the other direction, if a function is Riemann integrable, the definitions of supremum and 
infimum enable us to choose tags to obtain Riemann sums that are as close to upper and 
lower Darboux sums as we wish. In this way, we connect the Riemann integral to the upper 
and lower Darboux integrals. The details are given in the proof. 
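The idea of choosing tags so that a Riemann sum comes close to an upper Darboux sum can be illustrated on a function whose supremum is not attained. The sketch below is not from the text. It assumes the example function $f(x) = x$ on $[0, 1)$ with $f(1) = 0$, chosen here so that the supremum $1$ on the last subinterval is not a function value; the function names are hypothetical.

```python
from fractions import Fraction

def f(x):
    """f(x) = x on [0, 1) and f(1) = 0; sup f = 1 on any [c, 1] but is not attained."""
    return x if x < 1 else Fraction(0)

def upper_sum_and_close_riemann_sum(n, eps):
    """Upper sum on the equal partition P_n of [0, 1], and a Riemann sum
    whose tags satisfy f(t_k) > M_k - eps (here b - a = 1)."""
    width = Fraction(1, n)
    points = [Fraction(k, n) for k in range(n + 1)]
    # M_k = k/n on every subinterval, including the last one.
    upper = sum(Fraction(k, n) * width for k in range(1, n + 1))
    # Tags: right endpoints, except in the last subinterval, where the tag
    # stays just left of 1 because f(1) = 0 is far from the supremum.
    tags = points[1:-1] + [1 - min(eps, width) / 2]
    riemann = sum(f(t) * width for t in tags)
    return upper, riemann

eps = Fraction(1, 10)
U, S = upper_sum_and_close_riemann_sum(4, eps)
assert 0 < U - S < eps
```

With $n = 4$ and $\varepsilon = 1/10$ the upper sum is $5/8$ and the Riemann sum is $49/80$, so the two differ by $1/80$.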


7.4.11 Equivalence Theorem A function $f$ on $I = [a, b]$ is Darboux integrable if and
only if it is Riemann integrable.


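Proof. Assume that $f$ is Darboux integrable. For $\varepsilon > 0$, let $P_\varepsilon$ be a partition of $[a, b]$ such
that $U(f;P_\varepsilon) - L(f;P_\varepsilon) < \varepsilon$. For this partition, as in the preceding discussion, we define
the step functions $\alpha_\varepsilon$ and $\omega_\varepsilon$ on $[a, b]$ by $\alpha_\varepsilon(x) := m_k$ and $\omega_\varepsilon(x) := M_k$ for
$x \in [x_{k-1}, x_k)$, $k = 1, 2, \ldots, n - 1$, and $\alpha_\varepsilon(x) := m_n$, $\omega_\varepsilon(x) := M_n$ for $x \in [x_{n-1}, x_n]$,
where, as usual, $M_k$ is the supremum and $m_k$ the infimum of $f$ on $I_k = [x_{k-1}, x_k]$. Clearly
we have

$$\alpha_\varepsilon(x) \le f(x) \le \omega_\varepsilon(x) \quad\text{for all } x \text{ in } [a, b]. \tag{4}$$

Moreover, by Theorem 7.2.5, these functions are Riemann integrable and their integrals are
equal to

$$\int_a^b \omega_\varepsilon = \sum_{k=1}^{n} M_k (x_k - x_{k-1}) = U(f;P_\varepsilon), \qquad \int_a^b \alpha_\varepsilon = \sum_{k=1}^{n} m_k (x_k - x_{k-1}) = L(f;P_\varepsilon). \tag{5}$$

Therefore, we have

$$\int_a^b (\omega_\varepsilon - \alpha_\varepsilon) = U(f;P_\varepsilon) - L(f;P_\varepsilon) < \varepsilon.$$

By the Squeeze Theorem 7.2.3, it follows that $f$ is Riemann integrable. Moreover, we note
that (4) and (5) are valid for any partition $P$ and therefore the Riemann integral of $f$ lies
between $L(f;P)$ and $U(f;P)$ for any partition $P$. Therefore the Riemann integral of $f$ is
equal to the Darboux integral of $f$.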

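Now assume that $f$ is Riemann integrable and let $A = \int_a^b f$ denote the value of the
integral. Then, $f$ is bounded by Theorem 7.1.6, and given $\varepsilon > 0$, there exists $\delta > 0$ such that
for any tagged partition $\dot{P}$ with $\|\dot{P}\| < \delta$, we have $|S(f;\dot{P}) - A| < \varepsilon$, which can be written

$$A - \varepsilon < S(f;\dot{P}) < A + \varepsilon. \tag{6}$$

If $P = (x_0, x_1, \ldots, x_n)$ is a partition with $\|P\| < \delta$, then because $M_k = \sup\{ f(x) : x \in I_k \}$ is a supremum, we can
choose tags $t_k$ in $I_k$ such that $f(t_k) > M_k - \varepsilon/(b - a)$. Summing, and noting that
$\sum_{k=1}^{n} (x_k - x_{k-1}) = b - a$, we obtain

$$S(f;\dot{P}) = \sum_{k=1}^{n} f(t_k)(x_k - x_{k-1}) > \sum_{k=1}^{n} M_k (x_k - x_{k-1}) - \varepsilon = U(f;P) - \varepsilon \ge U(f) - \varepsilon. \tag{7}$$

Combining inequalities (6) and (7), we get

$$A + \varepsilon > S(f;\dot{P}) \ge U(f) - \varepsilon,$$

and hence we have $U(f) < A + 2\varepsilon$. Since $\varepsilon > 0$ is arbitrary, this implies that $U(f) \le A$.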

In the same manner, we can approximate lower sums by Riemann sums and show that
$L(f) > A - 2\varepsilon$ for arbitrary $\varepsilon > 0$, which implies $L(f) \ge A$. Thus we have obtained the
inequality $A \le L(f) \le U(f) \le A$, which gives us $L(f) = U(f) = A = \int_a^b f$. Hence, the
function $f$ is Darboux integrable with value equal to the Riemann integral. Q.E.D.

Exercises for Section 7.4


1. Let $f(x) := |x|$ for $-1 \le x \le 2$. Calculate $L(f;P)$ and $U(f;P)$ for the following partitions:
(a) $P_1 := (-1, 0, 1, 2)$, (b) $P_2 := (-1, -1/2, 0, 1/2, 1, 3/2, 2)$.

2. Prove if $f(x) := c$ for $x \in [a, b]$, then its Darboux integral is equal to $c(b - a)$.

3. Let $f$ and $g$ be bounded functions on $I := [a, b]$. If $f(x) \le g(x)$ for all $x \in I$, show that $L(f) \le
L(g)$ and $U(f) \le U(g)$.

4. Let $f$ be bounded on $[a, b]$ and let $k > 0$. Show that $L(kf) = kL(f)$ and $U(kf) = kU(f)$.

5. Let $f$, $g$, $h$ be bounded functions on $I := [a, b]$ such that $f(x) \le g(x) \le h(x)$ for all $x \in I$. Show
that if $f$ and $h$ are Darboux integrable and if $\int_a^b f = \int_a^b h$, then $g$ is also Darboux integrable with
$\int_a^b g = \int_a^b f$.

6. Let $f$ be defined on $[0, 2]$ by $f(x) := 1$ if $x \ne 1$ and $f(1) := 0$. Show that the Darboux integral
exists and find its value.

7. (a) Prove that if $g(x) := 0$ for $0 \le x \le \frac{1}{2}$ and $g(x) := 1$ for $\frac{1}{2} < x \le 1$, then the Darboux
integral of $g$ on $[0, 1]$ is equal to $\frac{1}{2}$.
(b) Does the conclusion hold if we change the value of $g$ at the point $\frac{1}{2}$ to 13?

8. Let $f$ be continuous on $I := [a, b]$ and assume $f(x) \ge 0$ for all $x \in I$. Prove if $L(f) = 0$, then
$f(x) = 0$ for all $x \in I$.

9. Let $f_1$ and $f_2$ be bounded functions on $[a, b]$. Show that $L(f_1) + L(f_2) \le L(f_1 + f_2)$.

10. Give an example to show that strict inequality can hold in the preceding exercise.

11. If $f$ is a bounded function on $[a, b]$ such that $f(x) = 0$ except for $x$ in $\{c_1, c_2, \ldots, c_n\}$ in $[a, b]$,
show that $U(f) = L(f) = 0$.

12. Let $f(x) = x^2$ for $0 \le x \le 1$. For the partition $P_n := (0, 1/n, 2/n, \ldots, (n - 1)/n, 1)$, calculate
$L(f;P_n)$ and $U(f;P_n)$, and show that $L(f) = U(f) = \frac{1}{3}$. (Use the formula $1^2 + 2^2
+ \cdots + m^2 = \frac{1}{6} m(m + 1)(2m + 1)$.)

13. Let $P_\varepsilon$ be the partition whose existence is asserted in the Integrability Criterion 7.4.8. Show that
if $P$ is any refinement of $P_\varepsilon$, then $U(f;P) - L(f;P) < \varepsilon$.

14. Write out the proofs that a function $f$ on $[a, b]$ is Darboux integrable if it is either (a) continuous,
or (b) monotone.

15. Let $f$ be defined on $I := [a, b]$ and assume that $f$ satisfies the Lipschitz condition
$|f(x) - f(y)| \le K|x - y|$ for all $x, y$ in $I$. If $P_n$ is the partition of $I$ into $n$ equal parts, show
that $0 \le U(f;P_n) - \int_a^b f \le K(b - a)^2/n$.


Section 7.5 Approximate Integration 


The Fundamental Theorem of Calculus 7.3.1 yields an effective method of evaluating the
integral $\int_a^b f$ provided we can find an antiderivative $F$ such that $F'(x) = f(x)$ when
$x \in [a, b]$. However, when we cannot find such an $F$, we may not be able to use the
Fundamental Theorem. Nevertheless, when $f$ is continuous, there are a number of techniques
for approximating the Riemann integral $\int_a^b f$ by using sums that resemble the Riemann sums.

One very elementary procedure to obtain a quick estimate of $\int_a^b f$, based on Theorem
7.1.5(c), is to note that if $g(x) \le f(x) \le h(x)$ for all $x \in [a, b]$, then

$$\int_a^b g \le \int_a^b f \le \int_a^b h.$$

If the integrals of $g$ and $h$ can be calculated, then we have bounds for $\int_a^b f$. Often these
bounds are accurate enough for our needs.

For example, suppose we wish to estimate the value of $\int_0^1 e^{-x^2}\,dx$. It is easy to show
that $e^{-x} \le e^{-x^2} \le 1$ for $x \in [0, 1]$, so that

$$\int_0^1 e^{-x}\,dx \le \int_0^1 e^{-x^2}\,dx \le \int_0^1 1\,dx.$$

Consequently, we have $1 - 1/e \le \int_0^1 e^{-x^2}\,dx \le 1$. If we use the mean of the bracketing
values, we obtain the estimate $1 - 1/2e \approx 0.816$ for the integral with an error less than
$1/2e < 0.184$. This estimate is crude, but it is obtained rapidly and may be quite
satisfactory for our needs. If a better approximation is desired, we can attempt to find
closer approximating functions $g$ and $h$.
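The bracketing estimate is easy to reproduce. The sketch below is an illustration and not part of the text. For comparison it uses the closed form $\int_0^1 e^{-x^2}\,dx = \frac{\sqrt{\pi}}{2}\operatorname{erf}(1)$, which is an outside fact brought in here only as a reference value.

```python
import math

# Exact integrals of the bracketing functions g(x) = e^{-x} and h(x) = 1 on [0, 1].
lower = 1 - math.exp(-1)
upper = 1.0
estimate = (lower + upper) / 2           # mean of the bracketing values, 1 - 1/(2e)
max_error = (upper - lower) / 2          # half the width of the bracket, 1/(2e)

# Reference value (assumption: closed form via the error function,
# not taken from the text).
reference = math.sqrt(math.pi) / 2 * math.erf(1.0)

print(round(estimate, 3), round(max_error, 3))  # 0.816 0.184
```

The true error of the midpoint of the bracket can be much smaller than the guaranteed bound, as comparing `estimate` with `reference` shows.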

Taylor’s Theorem 6.4.1 can be used to approximate $e^{-x^2}$ by a polynomial. In using
Taylor’s Theorem, we must get bounds on the remainder term for our calculations to have
significance. For example, if we apply Taylor’s Theorem to $e^{-y}$ for $0 \le y \le 1$, we get

$$e^{-y} = 1 - y + \tfrac{1}{2} y^2 - \tfrac{1}{6} y^3 + R_3,$$

where $R_3 = y^4 e^{-c}/24$ where $c$ is some number with $0 \le c \le 1$. Since we have no better
information as to the location of $c$, we must be content with the estimate $0 \le R_3 \le y^4/24$.
Hence we have

$$e^{-x^2} = 1 - x^2 + \tfrac{1}{2} x^4 - \tfrac{1}{6} x^6 + R_3,$$

where $0 \le R_3 \le x^8/24$, for $x \in [0, 1]$. Therefore, we obtain

$$\int_0^1 e^{-x^2}\,dx = \int_0^1 \left(1 - x^2 + \tfrac{1}{2} x^4 - \tfrac{1}{6} x^6\right) dx + \int_0^1 R_3\,dx
= 1 - \tfrac{1}{3} + \tfrac{1}{10} - \tfrac{1}{42} + \int_0^1 R_3\,dx.$$

Since we have $0 \le \int_0^1 R_3\,dx \le \frac{1}{9 \cdot 24} = \frac{1}{216} < 0.005$, it follows that

$$\int_0^1 e^{-x^2}\,dx \approx \frac{26}{35} \ (\approx 0.7429),$$

with an error less than 0.005.
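The arithmetic of the Taylor estimate can be verified in exact rational arithmetic. This sketch is illustrative and not from the text; the reference value again assumes the closed form $\frac{\sqrt{\pi}}{2}\operatorname{erf}(1)$ for the integral, which the text does not use.

```python
from fractions import Fraction
import math

# Integrate 1 - x^2 + x^4/2 - x^6/6 term by term over [0, 1]:
# the integral of c * x^p over [0, 1] is c / (p + 1).
coeffs = {0: Fraction(1), 2: Fraction(-1), 4: Fraction(1, 2), 6: Fraction(-1, 6)}
poly_integral = sum(c / (power + 1) for power, c in coeffs.items())

# Remainder bound: the integral of x^8/24 over [0, 1].
remainder_bound = Fraction(1, 9 * 24)

# Reference value (assumption: closed form via the error function).
reference = math.sqrt(math.pi) / 2 * math.erf(1.0)

print(poly_integral, remainder_bound)  # 26/35 1/216
```

The difference between `reference` and `26/35` is nonnegative and below `1/216`, in agreement with the remainder estimate.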


Equal Partitions 


If $f : [a, b] \to \mathbb{R}$ is continuous, we know that its Riemann integral exists. To find an
approximate value for this integral with the minimum amount of calculation, it is
convenient to consider partitions $P_n$ of $[a, b]$ into $n$ equal subintervals having length
$h_n := (b - a)/n$. Hence $P_n$ is the partition:

$$a < a + h_n < a + 2h_n < \cdots < a + nh_n = b.$$

If we pick our tag points to be the left endpoints and the right endpoints of the subintervals,
we obtain the $n$th left approximation given by

$$L_n(f) := h_n \sum_{k=0}^{n-1} f(a + kh_n),$$

and the $n$th right approximation given by

$$R_n(f) := h_n \sum_{k=1}^{n} f(a + kh_n).$$

It should be noted that it is almost as easy to evaluate both of these approximations as only
one of them, since they differ only by the terms $f(a)$ and $f(b)$.

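Unless we have reason to believe that one of $L_n(f)$ or $R_n(f)$ is closer to the actual value
of the integral than the other one, we generally take their mean:

$$\tfrac{1}{2}\bigl(L_n(f) + R_n(f)\bigr),$$

which is readily seen to equal

$$T_n(f) := h_n \left( \tfrac{1}{2} f(a) + \sum_{k=1}^{n-1} f(a + kh_n) + \tfrac{1}{2} f(b) \right), \tag{1}$$

as a reasonable approximation to $\int_a^b f$.

However, we note that if $f$ is increasing on $[a, b]$, then it is clear from a sketch of the
graph of $f$ that

$$L_n(f) \le \int_a^b f \le R_n(f). \tag{2}$$

In this case, we readily see that

$$\left| \int_a^b f - T_n(f) \right| \le \tfrac{1}{2}\bigl(R_n(f) - L_n(f)\bigr)
= \tfrac{1}{2} h_n \bigl(f(b) - f(a)\bigr) = \bigl(f(b) - f(a)\bigr) \cdot \frac{b - a}{2n}.$$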


An error estimate such as this is useful, since it gives an upper bound for the error of the 
approximation in terms of quantities that are known at the outset. In particular, it can be 
used to determine how large we should choose n in order to have an approximation that will 
be correct to within a specified error ¢ > 0. 

The above discussion was valid for the case that f is increasing on [a, b]. If f is 
decreasing, then the inequalities in (2) should be reversed. We can summarize both cases in 
the following statement. 


7.5.1 Theorem If $f : [a, b] \to \mathbb{R}$ is monotone and if $T_n(f)$ is given by (1), then

$$\left| \int_a^b f - T_n(f) \right| \le |f(b) - f(a)| \cdot \frac{b - a}{2n}. \tag{3}$$


7.5.2 Example If $f(x) := e^{-x^2}$ on $[0, 1]$, then $f$ is decreasing. It follows from (3) that
if $n = 8$, then $\left|\int_0^1 e^{-x^2}\,dx - T_8(f)\right| \le (1 - e^{-1})/16 < 0.04$, and if $n = 16$, then
$\left|\int_0^1 e^{-x^2}\,dx - T_{16}(f)\right| \le (1 - e^{-1})/32 < 0.02$. Actually, the approximation is
considerably better, as we will see in Example 7.5.5.
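The left, right, and trapezoidal approximations and the monotone error estimate (3) can be computed directly. The following sketch is an illustration, not part of the text. The function name `left_right_trapezoid` is hypothetical, and the reference value assumes the closed form $\frac{\sqrt{\pi}}{2}\operatorname{erf}(1)$ for $\int_0^1 e^{-x^2}\,dx$.

```python
import math

def left_right_trapezoid(f, a, b, n):
    """Return (L_n(f), R_n(f), T_n(f)) for n equal subintervals of [a, b]."""
    h = (b - a) / n
    interior = sum(f(a + k * h) for k in range(1, n))   # shared by L_n and R_n
    left = h * (f(a) + interior)
    right = h * (interior + f(b))
    return left, right, (left + right) / 2

f = lambda x: math.exp(-x * x)                          # decreasing on [0, 1]
reference = math.sqrt(math.pi) / 2 * math.erf(1.0)      # outside fact, for comparison

for n in (8, 16):
    Ln, Rn, Tn = left_right_trapezoid(f, 0.0, 1.0, n)
    bound = abs(f(1.0) - f(0.0)) * (1.0 - 0.0) / (2 * n)   # right side of (3)
    assert Rn <= reference <= Ln      # inequalities (2), reversed since f decreases
    assert abs(reference - Tn) <= bound
```

Only the interior sum is evaluated once, reflecting the remark that $L_n(f)$ and $R_n(f)$ differ only by the terms $f(a)$ and $f(b)$.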


The Trapezoidal Rule 


The method of numerical integration called the “Trapezoidal Rule” is based on
approximating the continuous function $f : [a, b] \to \mathbb{R}$ by a piecewise linear continuous function.
Let $n \in \mathbb{N}$ and, as before, let $h_n := (b - a)/n$ and consider the partition $P_n$. We
approximate $f$ by the piecewise linear function $g_n$ that passes through the points
$(a + kh_n, f(a + kh_n))$, where $k = 0, 1, \ldots, n$. It seems reasonable that the integral $\int_a^b f$
will be “approximately equal to” the integral $\int_a^b g_n$ when $n$ is sufficiently large (provided
that $f$ is reasonably smooth).

Since the area of a trapezoid with horizontal base $h$ and vertical sides $l_1$ and $l_2$ is known
to be $\frac{1}{2} h (l_1 + l_2)$, we have

$$\int_{a + kh_n}^{a + (k+1)h_n} g_n = \tfrac{1}{2} h_n \cdot \bigl[ f(a + kh_n) + f(a + (k + 1)h_n) \bigr],$$

for $k = 0, 1, \ldots, n - 1$. Summing these terms and noting that each partition point in $P_n$
except $a$ and $b$ belongs to two adjacent subintervals, we obtain

$$\int_a^b g_n = h_n \left( \tfrac{1}{2} f(a) + f(a + h_n) + \cdots + f(a + (n - 1)h_n) + \tfrac{1}{2} f(b) \right).$$

But the term on the right is precisely $T_n(f)$, found in (1) as the mean of $L_n(f)$ and $R_n(f)$.
We call $T_n(f)$ the $n$th Trapezoidal Approximation of $f$.

In Theorem 7.5.1 we obtained an error estimate in the case where f is monotone; we 
now state one without this restriction on f, but in terms of the second derivative f” of f. 


7.5.3 Theorem Let $f$, $f'$ and $f''$ be continuous on $[a, b]$ and let $T_n(f)$ be the $n$th
Trapezoidal Approximation (1). Then there exists $c \in [a, b]$ such that

$$T_n(f) - \int_a^b f = \frac{(b - a) h_n^2}{12} \cdot f''(c). \tag{4}$$

A proof of this result will be given in Appendix D; it depends on a number of results we
have obtained in Chapters 5 and 6.

The equality (4) is interesting in that it can give both an upper bound and a lower bound
for the difference $T_n(f) - \int_a^b f$. For example, if $f''(x) \ge A > 0$ for all $x \in [a, b]$, then (4)
implies that this difference always exceeds $\frac{1}{12} A (b - a) h_n^2$. If we only have $f''(x) \ge 0$ for
$x \in [a, b]$, which is the case when $f$ is convex (= concave upward), then the Trapezoidal
Approximation is always too large. The reader should draw a figure to visualize this.

However, it is usually the upper bound that is of greater interest.


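7.5.4 Corollary Let $f$, $f'$, and $f''$ be continuous, and let $|f''(x)| \le B_2$ for all $x \in [a, b]$.
Then

$$\left| T_n(f) - \int_a^b f \right| \le \frac{(b - a) h_n^2}{12} \cdot B_2 = \frac{(b - a)^3}{12 n^2} \cdot B_2. \tag{5}$$


When an upper bound $B_2$ can be found, (5) can be used to determine how large $n$ must
be chosen in order to be certain of a desired accuracy.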


7.5.5 Example If $f(x) := e^{-x^2}$ on $[0, 1]$, then a calculation shows that $f''(x) =
2e^{-x^2}(2x^2 - 1)$, so that we can take $B_2 = 2$. Thus, if $n = 8$, then

$$\left| T_8(f) - \int_0^1 f \right| \le \frac{2}{12 \cdot 64} = \frac{1}{384} < 0.003.$$

On the other hand, if $n = 16$, then we have

$$\left| T_{16}(f) - \int_0^1 f \right| \le \frac{2}{12 \cdot 256} = \frac{1}{1536} < 0.00066.$$

Thus, the accuracy in this case is considerably better than predicted in Example
7.5.2.

The Midpoint Rule 


One obvious method of approximating the integral of $f$ is to take the Riemann sums
evaluated at the midpoints of the subintervals. Thus, if $P_n$ is the equally spaced partition
given before, the Midpoint Approximation of $f$ is given by

$$M_n(f) := h_n \left( f\bigl(a + \tfrac{1}{2} h_n\bigr) + f\bigl(a + \tfrac{3}{2} h_n\bigr) + \cdots + f\bigl(a + (n - \tfrac{1}{2}) h_n\bigr) \right)
= h_n \sum_{k=1}^{n} f\bigl(a + (k - \tfrac{1}{2}) h_n\bigr). \tag{6}$$

Another method might be to use piecewise linear functions that are tangent to the
graph of $f$ at the midpoints of these subintervals. At first glance, it seems as if we would
need to know the slope of the tangent line to the graph of $f$ at each of the midpoints
$a + (k - \frac{1}{2}) h_n$ $(k = 1, 2, \ldots, n)$. However, it is an exercise in geometry to show that the
area of the trapezoid whose top is this tangent line at the midpoint $a + (k - \frac{1}{2}) h_n$ is equal to
the area of the rectangle whose height is $f(a + (k - \frac{1}{2}) h_n)$. (See Figure 7.5.1.) Thus, this
area is given by (6), and the “Tangent Trapezoid Rule” turns out to be the same as the
“Midpoint Rule.” We now state a theorem showing that the Midpoint Rule gives better
accuracy than the Trapezoidal Rule by a factor of 2.


7.5.6 Theorem Let $f$, $f'$, and $f''$ be continuous on $[a, b]$ and let $M_n(f)$ be the $n$th
Midpoint Approximation (6). Then there exists $\gamma \in [a, b]$ such that

$$\int_a^b f - M_n(f) = \frac{(b - a) h_n^2}{24} \cdot f''(\gamma). \tag{7}$$


The proof of this result is in Appendix D. 


Figure 7.5.1 The tangent trapezoid

As in the case with Theorem 7.5.3, formula (7) can be used to give both an upper bound
and a lower bound for the difference $\int_a^b f - M_n(f)$, although it is an upper bound that is
usually of greater interest. In contrast with the Trapezoidal Rule, if the function is convex,
then the Midpoint Approximation is always too small.

The next result is parallel to Corollary 7.5.4. 


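7.5.7 Corollary Let $f$, $f'$, and $f''$ be continuous, and let $|f''(x)| \le B_2$ for all $x \in [a, b]$.
Then

$$\left| M_n(f) - \int_a^b f \right| \le \frac{(b - a) h_n^2}{24} \cdot B_2 = \frac{(b - a)^3}{24 n^2} \cdot B_2. \tag{8}$$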
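The contrasting behavior of the Midpoint and Trapezoidal Approximations for a convex function can be observed numerically. The sketch below is an illustration, not part of the text. It assumes the convex integrand $f(x) = e^x$ on $[0, 1]$, chosen here because its integral $e - 1$ is known exactly and $|f''(x)| \le e$ on the interval.

```python
import math

def trapezoid(f, a, b, n):
    """nth Trapezoidal Approximation T_n(f), formula (1)."""
    h = (b - a) / n
    return h * (0.5 * f(a) + sum(f(a + k * h) for k in range(1, n)) + 0.5 * f(b))

def midpoint(f, a, b, n):
    """nth Midpoint Approximation M_n(f), formula (6)."""
    h = (b - a) / n
    return h * sum(f(a + (k - 0.5) * h) for k in range(1, n + 1))

# Illustrative convex integrand (an assumption of this sketch): f(x) = e^x.
a, b, n = 0.0, 1.0, 8
exact = math.e - 1
Mn, Tn = midpoint(math.exp, a, b, n), trapezoid(math.exp, a, b, n)
h = (b - a) / n

assert Mn <= exact <= Tn                              # midpoint too small, trapezoid too large
assert exact - Mn <= (b - a) * h**2 / 24 * math.e     # estimate (8) with B2 = e
assert Tn - exact <= (b - a) * h**2 / 12 * math.e     # estimate (5) with B2 = e
```

In this example the midpoint error is close to half the trapezoidal error, which matches the factor of 2 between the constants 24 and 12 in (7) and (4).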


Simpson’s Rule 


The final approximation procedure that we will consider usually gives a better
approximation than either the Trapezoidal or the Midpoint Rule and requires essentially no extra
calculation. However, the convexity (or the concavity) of $f$ does not give any information
about the error for this method.

Whereas the Trapezoidal and Midpoint Rules were based on the approximation of $f$ by
piecewise linear functions, Simpson’s Rule approximates the graph of $f$ by parabolic arcs.
To help motivate the formula, the reader may show that if three points

$$(-h, y_0), \quad (0, y_1), \quad\text{and}\quad (h, y_2)$$

are given, then the quadratic function $q(x) := Ax^2 + Bx + C$ that passes through these
points has the property that

$$\int_{-h}^{h} q = \tfrac{1}{3} h (y_0 + 4y_1 + y_2).$$


Now let $f$ be a continuous function on $[a, b]$ and let $n \in \mathbb{N}$ be even, and let
$h_n := (b - a)/n$. On each “double subinterval”

$$[a, a + 2h_n], \quad [a + 2h_n, a + 4h_n], \quad \ldots, \quad [b - 2h_n, b],$$

we approximate $f$ by $n/2$ quadratic functions that agree with $f$ at the points

$$y_0 := f(a), \quad y_1 := f(a + h_n), \quad y_2 := f(a + 2h_n), \quad \ldots, \quad y_n := f(b).$$

These considerations lead to the $n$th Simpson Approximation, defined by

$$S_n(f) := \tfrac{1}{3} h_n \bigl( f(a) + 4f(a + h_n) + 2f(a + 2h_n) + 4f(a + 3h_n)
+ 2f(a + 4h_n) + \cdots + 2f(b - 2h_n) + 4f(b - h_n) + f(b) \bigr). \tag{9}$$

Note that the coefficients of the values of $f$ at the $n + 1$ partition points follow the pattern
1, 4, 2, 4, 2, ..., 4, 2, 4, 1.

We now state a theorem that gives an estimate about the accuracy of the Simpson
Approximation; it involves the fourth derivative of $f$.


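7.5.8 Theorem Let $f$, $f'$, $f''$, $f^{(3)}$, and $f^{(4)}$ be continuous on $[a, b]$ and let $n \in \mathbb{N}$ be even. If
$S_n(f)$ is the $n$th Simpson Approximation (9), then there exists $c \in [a, b]$ such that

$$S_n(f) - \int_a^b f = \frac{(b - a) h_n^4}{180} \cdot f^{(4)}(c). \tag{10}$$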


A proof of this result is given in Appendix D.

The next result is parallel to Corollaries 7.5.4 and 7.5.7.


7.5.9 Corollary Let $f$, $f'$, $f''$, $f^{(3)}$, and $f^{(4)}$ be continuous on $[a, b]$ and let $|f^{(4)}(x)| \le B_4$
for all $x \in [a, b]$. Then

$$\left| S_n(f) - \int_a^b f \right| \le \frac{(b - a) h_n^4}{180} \cdot B_4 = \frac{(b - a)^5}{180 n^4} \cdot B_4. \tag{11}$$


Successful use of the estimate (11) depends on being able to find an upper bound for
the fourth derivative.


7.5.10 Example If $f(x) := e^{-x^2}$ on $[0, 1]$ then a calculation shows that

$$f^{(4)}(x) = 4e^{-x^2}(4x^4 - 12x^2 + 3),$$

whence it follows that $|f^{(4)}(x)| \le 20$ for $x \in [0, 1]$, so we can take $B_4 = 20$. It follows from
(11) that if $n = 8$ then

$$\left| S_8(f) - \int_0^1 f \right| \le \frac{1}{180 \cdot 8^4} \cdot 20 = \frac{1}{36{,}864} < 0.000\,03$$

and that if $n = 16$ then

$$\left| S_{16}(f) - \int_0^1 f \right| \le \frac{1}{589{,}824} < 0.000\,001\,7.$$
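Formula (9) and estimate (11) translate into a few lines of code. The following sketch is an illustration and not part of the text; the reference value assumes the closed form $\frac{\sqrt{\pi}}{2}\operatorname{erf}(1)$ for $\int_0^1 e^{-x^2}\,dx$.

```python
import math

def simpson(f, a, b, n):
    """nth Simpson Approximation S_n(f) from formula (9); n must be even."""
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    total = f(a) + f(b)                                       # coefficient 1 at both ends
    total += 4 * sum(f(a + k * h) for k in range(1, n, 2))    # odd-indexed points
    total += 2 * sum(f(a + k * h) for k in range(2, n, 2))    # even interior points
    return h * total / 3

f = lambda x: math.exp(-x * x)
reference = math.sqrt(math.pi) / 2 * math.erf(1.0)            # outside fact, for comparison

for n in (8, 16):
    bound = 20 / (180 * n ** 4)        # estimate (11) with b - a = 1 and B4 = 20
    assert abs(simpson(f, 0.0, 1.0, n) - reference) <= bound
```

Because the error term involves the fourth derivative, the same function returns the exact integral (up to rounding) for any polynomial of degree at most 3.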


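Remark The $n$th Midpoint Approximation $M_n(f)$ can be used to “step up” to the $(2n)$th
Trapezoidal and Simpson Approximations by using the formulas

$$T_{2n}(f) = \tfrac{1}{2} M_n(f) + \tfrac{1}{2} T_n(f) \quad\text{and}\quad S_{2n}(f) = \tfrac{2}{3} M_n(f) + \tfrac{1}{3} T_n(f)$$

that are given in the exercises. Thus once the initial Trapezoidal Approximation $T_1 =
T_1(f)$ has been calculated, only the Midpoint Approximations $M_n = M_n(f)$ need be found.
That is, we employ the following sequence of calculations:

$$T_1 = \tfrac{1}{2}(b - a)\bigl(f(a) + f(b)\bigr);$$
$$M_1 = (b - a) f\bigl(\tfrac{1}{2}(a + b)\bigr), \qquad T_2 = \tfrac{1}{2} M_1 + \tfrac{1}{2} T_1, \qquad S_2 = \tfrac{2}{3} M_1 + \tfrac{1}{3} T_1;$$
$$M_2, \qquad T_4 = \tfrac{1}{2} M_2 + \tfrac{1}{2} T_2, \qquad S_4 = \tfrac{2}{3} M_2 + \tfrac{1}{3} T_2;$$
$$M_4, \qquad T_8 = \tfrac{1}{2} M_4 + \tfrac{1}{2} T_4, \qquad S_8 = \tfrac{2}{3} M_4 + \tfrac{1}{3} T_4.$$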
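The step-up scheme of the Remark can be sketched as a short loop. This illustration is not part of the text; the function names are hypothetical, and the reference value assumes the closed form $\frac{\sqrt{\pi}}{2}\operatorname{erf}(1)$ for $\int_0^1 e^{-x^2}\,dx$.

```python
import math

def midpoint(f, a, b, n):
    """nth Midpoint Approximation M_n(f), formula (6)."""
    h = (b - a) / n
    return h * sum(f(a + (k - 0.5) * h) for k in range(1, n + 1))

def step_up(f, a, b, levels):
    """Return [(2, T_2, S_2), (4, T_4, S_4), ...] built only from T_1 and
    the Midpoint Approximations M_1, M_2, M_4, ..."""
    n = 1
    T = 0.5 * (b - a) * (f(a) + f(b))            # T_1
    rows = []
    for _ in range(levels):
        M = midpoint(f, a, b, n)
        S = (2 * M + T) / 3                      # S_2n = (2/3) M_n + (1/3) T_n
        T = (M + T) / 2                          # T_2n = (1/2) M_n + (1/2) T_n
        n *= 2
        rows.append((n, T, S))
    return rows

f = lambda x: math.exp(-x * x)
rows = step_up(f, 0.0, 1.0, 3)                   # up to T_8 and S_8
reference = math.sqrt(math.pi) / 2 * math.erf(1.0)   # outside fact, for comparison
assert abs(rows[-1][2] - reference) <= 20 / (180 * 8 ** 4)   # estimate (11), n = 8
```

Each new level reuses all earlier function evaluations through $T_n$, so only the midpoints are evaluated afresh.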


Exercises for Section 7.5 


1. Use the Trapezoidal Approximation with $n = 4$ to evaluate $\ln 2 = \int_1^2 (1/x)\,dx$. Show that
$0.6866 \le \ln 2 \le 0.6958$ and that

$$0.0013 < \frac{1}{768} \le T_4 - \ln 2 \le \frac{1}{96} < 0.0105.$$

2. Use the Simpson Approximation with $n = 4$ to evaluate $\ln 2 = \int_1^2 (1/x)\,dx$. Show that $0.6927 \le
\ln 2 \le 0.6933$ and that

$$0.000\,016 < \frac{1}{2^5} \cdot \frac{1}{1920} \le S_4 - \ln 2 \le \frac{1}{1920} < 0.000\,521.$$

3. Let $f(x) := (1 + x^2)^{-1}$ for $x \in [0, 1]$. Show that $f''(x) = 2(3x^2 - 1)(1 + x^2)^{-3}$ and that
$|f''(x)| \le 2$ for $x \in [0, 1]$. Use the Trapezoidal Approximation with $n = 4$ to evaluate $\pi/4 =
\int_0^1 f(x)\,dx$. Show that $|T_4(f) - (\pi/4)| \le 1/96 < 0.0105$.

4. If the Trapezoidal Approximation $T_n(f)$ is used to approximate $\pi/4$ as in Exercise 3, show that
we must take $n \ge 409$ in order to be sure that the error is less than $10^{-6}$.

5. Let $f$ be as in Exercise 3. Show that $f^{(4)}(x) = 24(5x^4 - 10x^2 + 1)(1 + x^2)^{-5}$ and that
$|f^{(4)}(x)| \le 96$ for $x \in [0, 1]$. Use Simpson’s Approximation with $n = 4$ to evaluate $\pi/4$.
Show that $|S_4(f) - (\pi/4)| \le 1/480 < 0.0021$.

6. If the Simpson Approximation $S_n(f)$ is used to approximate $\pi/4$ as in Exercise 5, show that we
must take $n \ge 28$ in order to be sure that the error is less than $10^{-6}$.

7. If $p$ is a polynomial of degree at most 3, show that the Simpson Approximations are exact.

8. Show that if $f''(x) \ge 0$ on $[a, b]$ (that is, if $f$ is convex on $[a, b]$), then for any natural numbers
$m, n$ we have $M_n(f) \le \int_a^b f(x)\,dx \le T_m(f)$. If $f''(x) \le 0$ on $[a, b]$, this inequality is reversed.

9. Show that $T_{2n}(f) = \frac{1}{2}[M_n(f) + T_n(f)]$.

10. Show that $S_{2n}(f) = \frac{2}{3} M_n(f) + \frac{1}{3} T_n(f)$.

11. Show that one has the estimate $\left| S_n(f) - \int_a^b f(x)\,dx \right| \le [(b - a)^2/18n^2] B_2$, where $B_2 \ge |f''(x)|$
for all $x \in [a, b]$.

12. Note that $\int_0^1 (1 - x^2)^{1/2}\,dx = \pi/4$. Explain why the error estimates given by formulas (4), (7),
and (10) cannot be used. Show that if $h(x) = (1 - x^2)^{1/2}$ for $x$ in $[0, 1]$, then
$T_n(h) \le \pi/4 \le M_n(h)$. Calculate $M_8(h)$ and $T_8(h)$.

13. If $h$ is as in Exercise 12, explain why $K := \int_0^{1/\sqrt{2}} h(x)\,dx = \pi/8 + 1/4$. Show that $|h''(x)| \le
2^{3/2}$ and that $|h^{(4)}(x)| \le 9 \cdot 2^{7/2}$ for $x \in [0, 1/\sqrt{2}]$. Show that $|K - T_n(h)| \le 1/12n^2$ and that
$|K - S_n(h)| \le 1/10n^4$. Use these results to calculate $\pi$.


In Exercises 14-20, approximate the indicated integrals, giving estimates for the error. Use a 
calculator to obtain a high degree of precision. 


14. 


2 2 
4\ 1/2 341/2 
fate) dx. 15. | (4+x°) “dx. 16. [% 
Tos /2 
[ewe 18. fi E wf vsin x dx. 
0 0 0 


1+ sinx 


CHAPTER 8 


SEQUENCES OF FUNCTIONS 


In previous chapters we have often made use of sequences of real numbers. In this chapter 
we shall consider sequences whose terms are functions rather than real numbers. 
Sequences of functions arise naturally in real analysis and are especially useful in obtaining 
approximations to a given function and defining new functions from known ones. 

In Section 8.1 we will introduce two different notions of convergence for a sequence of 
functions: pointwise convergence and uniform convergence. The latter type of convergence 
is very important, and will be the main focus of our attention. The reason for this focus is 
the fact that, as is shown in Section 8.2, uniform convergence “preserves” certain 
properties in the sense that if each term of a uniformly convergent sequence of functions 
possesses these properties, then the limit function also possesses the properties. 

In Section 8.3 we will apply the concept of uniform convergence to define and derive 
the basic properties of the exponential and logarithmic functions. Section 8.4 is devoted to 
a similar treatment of the trigonometric functions. 


Section 8.1 Pointwise and Uniform Convergence 


Let $A \subseteq \mathbb{R}$ be given and suppose that for each $n \in \mathbb{N}$ there is a function $f_n : A \to \mathbb{R}$; we
shall say that $(f_n)$ is a sequence of functions on $A$ to $\mathbb{R}$. Clearly, for each $x \in A$, such a
sequence gives rise to a sequence of real numbers, namely the sequence

$$(f_n(x)), \tag{1}$$

obtained by evaluating each of the functions at the point $x$. For certain values of $x \in A$ the
sequence (1) may converge, and for other values of $x \in A$ this sequence may diverge. For
each $x \in A$ for which the sequence (1) converges, there is a uniquely determined real
number $\lim(f_n(x))$. In general, the value of this limit, when it exists, will depend on the
choice of the point $x \in A$. Thus, there arises in this way a function whose domain consists
of all numbers $x \in A$ for which the sequence (1) converges.


8.1.1 Definition Let $(f_n)$ be a sequence of functions on $A \subseteq \mathbb{R}$ to $\mathbb{R}$, let $A_0 \subseteq A$, and let
$f : A_0 \to \mathbb{R}$. We say that the sequence $(f_n)$ converges on $A_0$ to $f$ if, for each $x \in A_0$, the
sequence $(f_n(x))$ converges to $f(x)$ in $\mathbb{R}$. In this case we call $f$ the limit on $A_0$ of the
sequence $(f_n)$. When such a function $f$ exists, we say that the sequence $(f_n)$ is convergent
on $A_0$, or that $(f_n)$ converges pointwise on $A_0$.


It follows from Theorem 3.1.4 that, except for a possible modification of the domain
$A_0$, the limit function is uniquely determined. Ordinarily we choose $A_0$ to be the largest set
possible; that is, we take $A_0$ to be the set of all $x \in A$ for which the sequence (1) is
convergent in $\mathbb{R}$.


In order to symbolize that the sequence $(f_n)$ converges on $A_0$ to $f$, we sometimes
write

$$f = \lim(f_n) \text{ on } A_0, \quad\text{or}\quad f_n \to f \text{ on } A_0.$$

Sometimes, when $f_n$ and $f$ are given by formulas, we write

$$f(x) = \lim f_n(x) \text{ for } x \in A_0, \quad\text{or}\quad f_n(x) \to f(x) \text{ for } x \in A_0.$$


8.1.2 Examples (a) $\lim(x/n) = 0$ for $x \in \mathbb{R}$.
For $n \in \mathbb{N}$, let $f_n(x) := x/n$ and let $f(x) := 0$ for $x \in \mathbb{R}$. By Example 3.1.6(a), we
have $\lim(1/n) = 0$. Hence it follows from Theorem 3.2.3 that

$$\lim(f_n(x)) = \lim(x/n) = x \lim(1/n) = x \cdot 0 = 0$$

for all $x \in \mathbb{R}$. (See Figure 8.1.1.)


Figure 8.1.1 $f_n(x) = x/n$
Figure 8.1.2 $g_n(x) = x^n$


(b) $\lim(x^n)$.

Let $g_n(x) := x^n$ for $x \in \mathbb{R}$, $n \in \mathbb{N}$. (See Figure 8.1.2.) Clearly, if $x = 1$, then the
sequence $(g_n(1)) = (1)$ converges to 1. It follows from Example 3.1.11(b) that $\lim(x^n) = 0$
for $0 \le x < 1$ and it is readily seen that this is also true for $-1 < x < 0$. If $x = -1$, then
$g_n(-1) = (-1)^n$, and it was seen in Example 3.2.8(b) that the sequence is divergent.
Similarly, if $|x| > 1$, then the sequence $(x^n)$ is not bounded, and so it is not convergent in $\mathbb{R}$.
We conclude that if

$$g(x) := \begin{cases} 0 & \text{for } -1 < x < 1, \\ 1 & \text{for } x = 1, \end{cases}$$

then the sequence $(g_n)$ converges to $g$ on the set $(-1, 1]$.
(c) $\lim((x^2 + nx)/n) = x$ for $x \in \mathbb{R}$.

Let $h_n(x) := (x^2 + nx)/n$ for $x \in \mathbb{R}$, $n \in \mathbb{N}$, and let $h(x) := x$ for $x \in \mathbb{R}$. (See
Figure 8.1.3.) Since we have $h_n(x) = (x^2/n) + x$, it follows from Example 3.1.6(a)
and Theorem 3.2.3 that $h_n(x) \to x = h(x)$ for all $x \in \mathbb{R}$.


Figure 8.1.3 $h_n(x) = (x^2 + nx)/n$
Figure 8.1.4 $F_n(x) = \sin(nx + n)/n$


(d) $\lim((1/n) \sin(nx + n)) = 0$ for $x \in \mathbb{R}$.
Let $F_n(x) := (1/n) \sin(nx + n)$ for $x \in \mathbb{R}$, $n \in \mathbb{N}$, and let $F(x) := 0$ for $x \in \mathbb{R}$. (See
Figure 8.1.4.) Since $|\sin y| \le 1$ for all $y \in \mathbb{R}$ we have

$$|F_n(x) - F(x)| = \left| \frac{1}{n} \sin(nx + n) \right| \le \frac{1}{n} \tag{2}$$

for all $x \in \mathbb{R}$. Therefore it follows that $\lim(F_n(x)) = 0 = F(x)$ for all $x \in \mathbb{R}$. The reader
should note that, given any $\varepsilon > 0$, if $n$ is sufficiently large, then $|F_n(x) - F(x)| < \varepsilon$ for all
values of $x$ simultaneously!


Partly to reinforce Definition 8.1.1 and partly to prepare the way for the important 
notion of uniform convergence, we reformulate Definition 8.1.1 as follows. 


8.1.3 Lemma A sequence $(f_n)$ of functions on $A \subseteq \mathbb{R}$ to $\mathbb{R}$ converges to a function
$f : A_0 \to \mathbb{R}$ on $A_0$ if and only if for each $\varepsilon > 0$ and each $x \in A_0$ there is a natural number
$K(\varepsilon, x)$ such that if $n \ge K(\varepsilon, x)$, then

$$|f_n(x) - f(x)| < \varepsilon. \tag{3}$$


We leave it to the reader to show that this is equivalent to Definition 8.1.1. We wish to
emphasize that the value of $K(\varepsilon, x)$ will depend, in general, on both $\varepsilon > 0$ and $x \in A_0$. The
reader should confirm the fact that in Examples 8.1.2(a-c), the value of $K(\varepsilon, x)$ required to
obtain an inequality such as (3) does depend on both $\varepsilon > 0$ and $x \in A_0$. The intuitive reason
for this is that the convergence of the sequence is “significantly faster” at some points than
it is at others. However, in Example 8.1.2(d), as we have seen in inequality (2), if we choose
$n$ sufficiently large, we can make $|F_n(x) - F(x)| < \varepsilon$ for all values of $x \in \mathbb{R}$. It is precisely
this rather subtle difference that distinguishes between the notion of the “pointwise
convergence” of a sequence of functions (as defined in Definition 8.1.1) and the notion of
“uniform convergence.”
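The dependence of $K(\varepsilon, x)$ on the point $x$ can be made concrete for Examples 8.1.2(a) and 8.1.2(d). The sketch below is illustrative only; the helper names `K_pointwise` and `K_uniform` are not from the text, and each returns one admissible choice of the index, computed from the inequalities $|x|/n < \varepsilon$ and $1/n < \varepsilon$.

```python
import math

def K_pointwise(eps, x):
    """Smallest natural number K with |x/n - 0| < eps for all n >= K (Example 8.1.2(a))."""
    return math.floor(abs(x) / eps) + 1

def K_uniform(eps):
    """A natural number K with |(1/n) sin(nx + n)| <= 1/n < eps for all n >= K and every x."""
    return math.floor(1.0 / eps) + 1

eps = 0.5
for x in (0.0, 1.0, 100.0):
    print("x =", x, "K(eps, x) =", K_pointwise(eps, x))   # grows with |x|
print("K(eps) =", K_uniform(eps))                          # one index serves every x
```

In the first case no single index works for all real $x$ at once; in the second the same index works everywhere, which is the distinction drawn in the paragraph above.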


Uniform Convergence 


8.1.4 Definition A sequence $(f_n)$ of functions on $A \subseteq \mathbb{R}$ to $\mathbb{R}$ converges uniformly on
$A_0 \subseteq A$ to a function $f : A_0 \to \mathbb{R}$ if for each $\varepsilon > 0$ there is a natural number $K(\varepsilon)$
(depending on $\varepsilon$ but not on $x \in A_0$) such that if $n \ge K(\varepsilon)$, then

$$|f_n(x) - f(x)| < \varepsilon \quad\text{for all } x \in A_0. \tag{4}$$

In this case we say that the sequence $(f_n)$ is uniformly convergent on $A_0$. Sometimes we
write

$$f_n \rightrightarrows f \text{ on } A_0, \quad\text{or}\quad f_n(x) \rightrightarrows f(x) \text{ for } x \in A_0.$$


It is an immediate consequence of the definitions that if the sequence $(f_n)$ is uniformly
convergent on $A_0$ to $f$, then this sequence also converges pointwise on $A_0$ to $f$ in the sense of
Definition 8.1.1. That the converse is not always true is seen by a careful examination of
Examples 8.1.2(a-c); other examples will be given below.

It is sometimes useful to have the following necessary and sufficient condition for a
sequence $(f_n)$ to fail to converge uniformly on $A_0$ to $f$.


8.1.5 Lemma A sequence $(f_n)$ of functions on $A \subseteq \mathbb{R}$ to $\mathbb{R}$ does not converge uniformly
on $A_0 \subseteq A$ to a function $f : A_0 \to \mathbb{R}$ if and only if for some $\varepsilon_0 > 0$ there is a subsequence
$(f_{n_k})$ of $(f_n)$ and a sequence $(x_k)$ in $A_0$ such that

$$|f_{n_k}(x_k) - f(x_k)| \ge \varepsilon_0 \quad\text{for all } k \in \mathbb{N}. \tag{5}$$


The proof of this result requires only that the reader negate Definition 8.1.4; we
leave this to the reader as an important exercise. We now show how this result can be
used.


8.1.6 Examples (a) Consider Example 8.1.2(a). If we let $n_k := k$ and $x_k := k$, then
$f_{n_k}(x_k) = 1$ so that $|f_{n_k}(x_k) - f(x_k)| = |1 - 0| = 1$. Therefore the sequence $(f_n)$ does not
converge uniformly on $\mathbb{R}$ to $f$.

(b) Consider Example 8.1.2(b). If $n_k := k$ and $x_k := \left(\frac{1}{2}\right)^{1/k}$, then

$$|g_{n_k}(x_k) - g(x_k)| = \left| \tfrac{1}{2} - 0 \right| = \tfrac{1}{2}.$$

Therefore the sequence $(g_n)$ does not converge uniformly on $(-1, 1]$ to $g$.
(c) Consider Example 8.1.2(c). If $n_k := k$ and $x_k := -k$, then $h_{n_k}(x_k) = 0$ and $h(x_k) =
-k$ so that $|h_{n_k}(x_k) - h(x_k)| = k$. Therefore the sequence $(h_n)$ does not converge
uniformly on $\mathbb{R}$ to $h$.
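The witness sequences of Lemma 8.1.5 can be evaluated directly. In the sketch below the function `witness_gaps` is a hypothetical helper; the choices $x_k = k$ and $x_k = -k$ are those of parts (a) and (c), while the choice $x_k = (1/2)^{1/k}$ for part (b) is an assumption (it is the standard one, giving $x_k^k = 1/2$).

```python
def f(n, x):
    return x / n                      # Example 8.1.2(a), limit f(x) = 0

def g(n, x):
    return x ** n                     # Example 8.1.2(b), limit g(x) = 0 for -1 < x < 1

def h(n, x):
    return (x * x + n * x) / n        # Example 8.1.2(c), limit h(x) = x

def witness_gaps(k):
    """Return |f_k(x_k) - f(x_k)| for the three witness sequences with n_k = k."""
    gap_a = abs(f(k, float(k)) - 0.0)             # x_k = k
    x_b = 0.5 ** (1.0 / k)                        # x_k = (1/2)^(1/k), assumed choice
    gap_b = abs(g(k, x_b) - 0.0)
    gap_c = abs(h(k, float(-k)) - float(-k))      # x_k = -k
    return gap_a, gap_b, gap_c

for k in (1, 10, 100):
    print(k, witness_gaps(k))
```

None of the three gaps shrinks as $k$ grows, so condition (5) holds with $\varepsilon_0 = 1$, $\varepsilon_0 = 1/2$, and $\varepsilon_0 = 1$, respectively.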


The Uniform Norm 


In discussing uniform convergence, it is often convenient to use the notion of the uniform 
norm on a set of bounded functions. 


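8.1.7 Definition If $A \subseteq \mathbb{R}$ and $\varphi : A \to \mathbb{R}$ is a function, we say that $\varphi$ is bounded on
$A$ if the set $\varphi(A)$ is a bounded subset of $\mathbb{R}$. If $\varphi$ is bounded we define the uniform norm of
$\varphi$ on $A$ by

$$\|\varphi\|_A := \sup\{|\varphi(x)| : x \in A\}. \tag{6}$$

Note that it follows that if $\varepsilon > 0$, then

$$\|\varphi\|_A \le \varepsilon \iff |\varphi(x)| \le \varepsilon \quad\text{for all } x \in A. \tag{7}$$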


8.1.8 Lemma A sequence $(f_n)$ of bounded functions on $A \subseteq \mathbb{R}$ converges uniformly on $A$
to $f$ if and only if $\|f_n - f\|_A \to 0$.

Proof. ($\Rightarrow$) If $(f_n)$ converges uniformly on $A$ to $f$, then by Definition 8.1.4, given any $\varepsilon > 0$
there exists $K(\varepsilon)$ such that if $n \ge K(\varepsilon)$ and $x \in A$ then

$$|f_n(x) - f(x)| \le \varepsilon.$$

From the definition of supremum, it follows that $\|f_n - f\|_A \le \varepsilon$ whenever $n \ge K(\varepsilon)$. Since
$\varepsilon > 0$ is arbitrary this implies that $\|f_n - f\|_A \to 0$.

($\Leftarrow$) If $\|f_n - f\|_A \to 0$, then given $\varepsilon > 0$ there is a natural number $H(\varepsilon)$ such that if
$n \ge H(\varepsilon)$ then $\|f_n - f\|_A \le \varepsilon$. It follows from (7) that $|f_n(x) - f(x)| \le \varepsilon$ for all $n \ge H(\varepsilon)$
and $x \in A$. Therefore $(f_n)$ converges uniformly on $A$ to $f$. Q.E.D.


We now illustrate the use of Lemma 8.1.8 as a tool in examining a sequence of 
bounded functions for uniform convergence. 


8.1.9 Examples (a) We cannot apply Lemma 8.1.8 to the sequence in Example 8.1.2(a)
since the function $f_n(x) - f(x) = x/n$ is not bounded on $\mathbb{R}$.

For the sake of illustration, let $A := [0, 1]$. Although the sequence $(x/n)$ did not
converge uniformly on $\mathbb{R}$ to the zero function, we shall show that the convergence is
uniform on $A$. To see this, we observe that

$$\|f_n - f\|_A = \sup\{|x/n - 0| : 0 \le x \le 1\} = \frac{1}{n}$$

so that $\|f_n - f\|_A \to 0$. Therefore $(f_n)$ is uniformly convergent on $A$ to $f$.
(b) Let $g_n(x) := x^n$ for $x \in A := [0, 1]$ and $n \in \mathbb{N}$, and let $g(x) := 0$ for $0 \le x < 1$ and
$g(1) := 1$. The functions $g_n(x) - g(x)$ are bounded on $A$ and

$$\|g_n - g\|_A = \sup \begin{Bmatrix} x^n & \text{for } 0 \le x < 1 \\ 0 & \text{for } x = 1 \end{Bmatrix} = 1$$

for any $n \in \mathbb{N}$. Since $\|g_n - g\|_A$ does not converge to 0, we infer that the sequence $(g_n)$
does not converge uniformly on $A$ to $g$.
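The two uniform norms in parts (a) and (b) can be estimated by sampling. This is only a sketch: a maximum over finitely many sample points gives a lower bound for the supremum, not the supremum itself, and the names `sup_norm_on_grid`, `GRID`, `norm_a`, and `norm_b` are illustrative.

```python
def sup_norm_on_grid(phi, points):
    """Largest |phi(x)| over the sample points: a lower bound for the uniform norm."""
    return max(abs(phi(x)) for x in points)

GRID = [i / 10000 for i in range(10001)]   # sample points in [0, 1], endpoints included

def g_limit(x):
    """Limit function g of part (b): 0 on [0, 1) and 1 at x = 1."""
    return 1.0 if x == 1.0 else 0.0

def norm_a(n):
    return sup_norm_on_grid(lambda x: x / n - 0.0, GRID)

def norm_b(n):
    return sup_norm_on_grid(lambda x: x ** n - g_limit(x), GRID)

for n in (1, 10, 100):
    print(n, norm_a(n), norm_b(n))
```

The first column of norms decays like $1/n$, while the second stays close to 1 for every $n$ shown, in line with the conclusions of parts (a) and (b).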
(c) We cannot apply Lemma 8.1.8 to the sequence in Example 8.1.2(c) since the function
$h_n(x) - h(x) = x^2/n$ is not bounded on $\mathbb{R}$.

Instead, let $A := [0, 8]$ and consider

$$\|h_n - h\|_A = \sup\{x^2/n : 0 \le x \le 8\} = 64/n.$$

Therefore, the sequence $(h_n)$ converges uniformly on $A$ to $h$.

(d) If we refer to Example 8.1.2(d), we see from (2) that $\|F_n - F\|_{\mathbb{R}} \le 1/n$. Hence $(F_n)$
converges uniformly on $\mathbb{R}$ to $F$.

(e) Let $G_n(x) := x^n(1 - x)$ for $x \in A := [0, 1]$. Then the sequence $(G_n(x))$ converges to
$G(x) := 0$ for each $x \in A$. To calculate the uniform norm of $G_n - G = G_n$ on $A$, we find the
derivative and solve

$$G_n'(x) = x^{n-1}(n - (n + 1)x) = 0$$

to obtain the point $x_n := n/(n + 1)$. This is an interior point of $[0, 1]$, and it is easily
verified by using the First Derivative Test 6.2.8 that $G_n$ attains a maximum on $[0, 1]$ at $x_n$.
Therefore, we obtain

$$\|G_n\|_A = G_n(x_n) = (1 + 1/n)^{-n} \cdot \frac{1}{n + 1},$$

which converges to $(1/e) \cdot 0 = 0$. Thus we see that convergence is uniform on $A$.

By making use of the uniform norm, we can obtain a necessary and sufficient condition
for uniform convergence that is often useful.


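8.1.10 Cauchy Criterion for Uniform Convergence Let $(f_n)$ be a sequence of bounded
functions on $A \subseteq \mathbb{R}$. Then this sequence converges uniformly on $A$ to a bounded function $f$
if and only if for each $\varepsilon > 0$ there is a number $H(\varepsilon)$ in $\mathbb{N}$ such that for all $m, n \ge H(\varepsilon)$,

$$\|f_m - f_n\|_A \le \varepsilon.$$


Proof. ($\Rightarrow$) If $f_n \rightrightarrows f$ on $A$, then given $\varepsilon > 0$ there exists a natural number $K(\frac{1}{2}\varepsilon)$ such
that if $n \ge K(\frac{1}{2}\varepsilon)$ then $\|f_n - f\|_A \le \frac{1}{2}\varepsilon$. Hence, if both $m, n \ge K(\frac{1}{2}\varepsilon)$, then we conclude
that

$$|f_m(x) - f_n(x)| \le |f_m(x) - f(x)| + |f_n(x) - f(x)| \le \tfrac{1}{2}\varepsilon + \tfrac{1}{2}\varepsilon = \varepsilon$$

for all $x \in A$. Therefore $\|f_m - f_n\|_A \le \varepsilon$ for $m, n \ge K(\frac{1}{2}\varepsilon) =: H(\varepsilon)$.
($\Leftarrow$) Conversely, suppose that for $\varepsilon > 0$ there is $H(\varepsilon)$ such that if $m, n \ge H(\varepsilon)$, then
$\|f_m - f_n\|_A \le \varepsilon$. Therefore, for each $x \in A$ we have

$$|f_m(x) - f_n(x)| \le \|f_m - f_n\|_A \le \varepsilon \quad\text{for } m, n \ge H(\varepsilon). \tag{8}$$

It follows that $(f_n(x))$ is a Cauchy sequence in $\mathbb{R}$; therefore, by Theorem 3.5.5, it is a
convergent sequence. We define $f : A \to \mathbb{R}$ by

$$f(x) := \lim(f_n(x)) \quad\text{for } x \in A.$$

If we let $n \to \infty$ in (8), it follows from Theorem 3.2.6 that for each $x \in A$ we have

$$|f_m(x) - f(x)| \le \varepsilon \quad\text{for } m \ge H(\varepsilon).$$

Therefore the sequence $(f_n)$ converges uniformly on $A$ to $f$. Q.E.D.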


Exercises for Section 8.1 


1. Show that $\lim(x/(x + n)) = 0$ for all $x \in \mathbb{R}$, $x \ge 0$.

2. Show that $\lim(nx/(1 + n^2x^2)) = 0$ for all $x \in \mathbb{R}$.

3. Evaluate $\lim(nx/(1 + nx))$ for $x \in \mathbb{R}$, $x \ge 0$.

4. Evaluate $\lim(x^n/(1 + x^n))$ for $x \in \mathbb{R}$, $x \ge 0$.

5. Evaluate $\lim((\sin nx)/(1 + nx))$ for $x \in \mathbb{R}$, $x \ge 0$.

6. Show that $\lim(\operatorname{Arctan} nx) = (\pi/2) \operatorname{sgn} x$ for $x \in \mathbb{R}$.

7. Evaluate $\lim(e^{-nx})$ for $x \in \mathbb{R}$, $x \ge 0$.

8. Show that $\lim(xe^{-nx}) = 0$ for $x \in \mathbb{R}$, $x \ge 0$.

9. Show that $\lim(x^2e^{-nx}) = 0$ and that $\lim(n^2x^2e^{-nx}) = 0$ for $x \in \mathbb{R}$, $x \ge 0$.

10. Show that $\lim((\cos x)^{2n})$ exists for all $x \in \mathbb{R}$. What is its limit?


11. Show that if $a > 0$, then the convergence of the sequence in Exercise 1 is uniform on the interval
$[0, a]$, but is not uniform on the interval $[0, \infty)$.


12. Show that if $a > 0$, then the convergence of the sequence in Exercise 2 is uniform on the interval
$[a, \infty)$, but is not uniform on the interval $[0, \infty)$.


13. Show that if $a > 0$, then the convergence of the sequence in Exercise 3 is uniform on the interval
$[a, \infty)$, but is not uniform on the interval $[0, \infty)$.


14. Show that if $0 < b < 1$, then the convergence of the sequence in Exercise 4 is uniform on the
interval $[0, b]$, but is not uniform on the interval $[0, 1]$.


15. Show that if $a > 0$, then the convergence of the sequence in Exercise 5 is uniform on the interval
$[a, \infty)$, but is not uniform on the interval $[0, \infty)$.


16. Show that if $a > 0$, then the convergence of the sequence in Exercise 6 is uniform on the interval
$[a, \infty)$, but is not uniform on the interval $(0, \infty)$.


17. Show that if $a > 0$, then the convergence of the sequence in Exercise 7 is uniform on the interval
$[a, \infty)$, but is not uniform on the interval $[0, \infty)$.


18. Show that the convergence of the sequence in Exercise 8 is uniform on $[0, \infty)$.

19. Show that the sequence $(x^2e^{-nx})$ converges uniformly on $[0, \infty)$.


20. Show that if $a > 0$, then the sequence $(n^2x^2e^{-nx})$ converges uniformly on the interval $[a, \infty)$, but
that it does not converge uniformly on the interval $[0, \infty)$.


21. Show that if $(f_n)$, $(g_n)$ converge uniformly on the set $A$ to $f$, $g$, respectively, then $(f_n + g_n)$
converges uniformly on $A$ to $f + g$.


22. Show that if $f_n(x) := x + 1/n$ and $f(x) := x$ for $x \in \mathbb{R}$, then $(f_n)$ converges uniformly on $\mathbb{R}$ to $f$,
but the sequence $(f_n^2)$ does not converge uniformly on $\mathbb{R}$. (Thus the product of uniformly
convergent sequences of functions may not converge uniformly.)


23. Let $(f_n)$, $(g_n)$ be sequences of bounded functions on $A$ that converge uniformly on $A$ to $f$, $g$,
respectively. Show that $(f_n g_n)$ converges uniformly on $A$ to $fg$.


24. Let $(f_n)$ be a sequence of functions that converges uniformly to $f$ on $A$ and that satisfies $|f_n(x)| \le
M$ for all $n \in \mathbb{N}$ and all $x \in A$. If $g$ is continuous on the interval $[-M, M]$, show that the sequence
$(g \circ f_n)$ converges uniformly to $g \circ f$ on $A$.


Section 8.2 Interchange of Limits 


It is often useful to know whether the limit of a sequence of functions is a continuous 
function, a differentiable function, or a Riemann integrable function. Unfortunately, it is 
not always the case that the limit of a sequence of functions possesses these useful 
properties. 


8.2.1 Examples (a) Let $g_n(x) := x^n$ for $x \in [0, 1]$ and $n \in \mathbb{N}$. Then, as we have noted in
Example 8.1.2(b), the sequence $(g_n)$ converges pointwise to the function

$$g(x) := \begin{cases} 0 & \text{for } 0 \le x < 1, \\ 1 & \text{for } x = 1. \end{cases}$$

Although all of the functions $g_n$ are continuous at $x = 1$, the limit function $g$ is not
continuous at $x = 1$. Recall that it was shown in Example 8.1.6(b) that this sequence does
not converge uniformly to $g$ on $[0, 1]$.
(b) Each of the functions $g_n(x) = x^n$ in part (a) has a continuous derivative on $[0, 1]$.
However, the limit function $g$ does not have a derivative at $x = 1$, since it is not continuous
at that point.
(c) Let $f_n : [0, 1] \to \mathbb{R}$ be defined for $n \ge 2$ by

$$f_n(x) := \begin{cases} n^2 x & \text{for } 0 \le x \le 1/n, \\ -n^2(x - 2/n) & \text{for } 1/n \le x \le 2/n, \\ 0 & \text{for } 2/n \le x \le 1. \end{cases}$$

(See Figure 8.2.1.) It is clear that each of the functions $f_n$ is continuous on $[0, 1]$; hence it is
Riemann integrable. Either by means of a direct calculation, or by referring to the
significance of the integral as an area, we obtain

$$\int_0^1 f_n(x)\,dx = 1 \quad\text{for } n \ge 2.$$

The reader may show that $f_n(x) \to 0$ for all $x \in [0, 1]$; hence the limit function $f$ vanishes
identically and is continuous (and hence integrable), and $\int_0^1 f(x)\,dx = 0$. Therefore we
have the uncomfortable situation that:

$$\int_0^1 f(x)\,dx = 0 \ne 1 = \lim \int_0^1 f_n(x)\,dx.$$


Figure 8.2.1 Example 8.2.1(c)


(d) Those who consider the functions $f_n$ in part (c) to be “artificial” may prefer to consider
the sequence $(h_n)$ defined by $h_n(x) := 2nxe^{-nx^2}$ for $x \in [0, 1]$, $n \in \mathbb{N}$. Since $h_n = H_n'$,
where $H_n(x) := -e^{-nx^2}$, the Fundamental Theorem 7.3.1 gives

$$\int_0^1 h_n(x)\,dx = H_n(1) - H_n(0) = 1 - e^{-n}.$$

It is an exercise to show that $h(x) := \lim(h_n(x)) = 0$ for all $x \in [0, 1]$; hence

$$\int_0^1 h(x)\,dx \ne \lim \int_0^1 h_n(x)\,dx.$$

Although the extent of the discontinuity of the limit function in Example 8.2.1 (a) is 
not very great, it is evident that more complicated examples can be constructed that will 
produce more extensive discontinuity. In any case, we must abandon the hope that the limit 
of a convergent sequence of continuous [respectively, differentiable, integrable] functions 
will be continuous [respectively, differentiable, integrable]. 

It will now be seen that the additional hypothesis of uniform convergence is sufficient 
to guarantee that the limit of a sequence of continuous functions is continuous. Similar 
results will also be established for sequences of differentiable and integrable functions. 


Interchange of Limit and Continuity 


8.2.2 Theorem Let $(f_n)$ be a sequence of continuous functions on a set $A \subseteq \mathbb{R}$ and suppose
that $(f_n)$ converges uniformly on $A$ to a function $f : A \to \mathbb{R}$. Then $f$ is continuous on $A$.

Proof. By hypothesis, given $\varepsilon > 0$ there exists a natural number $H := H(\frac{1}{3}\varepsilon)$ such that if
$n \ge H$ then $|f_n(x) - f(x)| < \frac{1}{3}\varepsilon$ for all $x \in A$. Let $c \in A$ be arbitrary; we will show that $f$ is
continuous at $c$. By the Triangle Inequality we have

$$|f(x) - f(c)| \le |f(x) - f_H(x)| + |f_H(x) - f_H(c)| + |f_H(c) - f(c)| \le \tfrac{1}{3}\varepsilon + |f_H(x) - f_H(c)| + \tfrac{1}{3}\varepsilon.$$

Since $f_H$ is continuous at $c$, there exists a number $\delta := \delta(\frac{1}{3}\varepsilon, c, f_H) > 0$ such that if
$|x - c| < \delta$ and $x \in A$, then $|f_H(x) - f_H(c)| < \frac{1}{3}\varepsilon$. Therefore, if $|x - c| < \delta$ and $x \in A$,
then we have $|f(x) - f(c)| < \varepsilon$. Since $\varepsilon > 0$ is arbitrary, this establishes the continuity of $f$
at the arbitrary point $c \in A$. (See Figure 8.2.2.) Q.E.D.


Figure 8.2.2 $|f(x) - f(c)| < \varepsilon$


Remark Although the uniform convergence of the sequence of continuous functions 
is sufficient to guarantee the continuity of the limit function, it is not necessary. 
(See Exercise 2.) 


Interchange of Limit and Derivative 


We mentioned in Section 6.1 that Weierstrass showed that the function defined by the series

$$f(x) := \sum_{k=0}^{\infty} 2^{-k} \cos(3^k x)$$

is continuous at every point but does not have a derivative at any point in $\mathbb{R}$. By considering
the partial sums of this series, we obtain a sequence of functions $(f_n)$ that possess a
derivative at every point and are uniformly convergent to $f$. Thus, even though the sequence
of differentiable functions $(f_n)$ is uniformly convergent, it does not follow that the limit
function is differentiable. (See Exercises 9 and 10.)
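The uniform convergence of the partial sums can be seen from the tail of the geometric series: since $|\cos| \le 1$, for $m > n$ one has $|f_m(x) - f_n(x)| \le \sum_{k=n+1}^{m} 2^{-k} < 2^{-n}$ for every $x$. The sketch below checks this bound at a few sample points; the name `weierstrass_partial` is illustrative, and a finite sample is of course not a proof.

```python
import math

def weierstrass_partial(n, x):
    """Partial sum f_n(x) = sum_{k=0}^{n} 2^(-k) cos(3^k x)."""
    return sum(2.0 ** (-k) * math.cos(3.0 ** k * x) for k in range(n + 1))

samples = [-3.0, -1.0, 0.0, 0.25, 1.0, 2.5]
n, m = 5, 25
worst = max(abs(weierstrass_partial(m, x) - weierstrass_partial(n, x)) for x in samples)
print(worst, 2.0 ** (-n))   # the first number should not exceed the second
```

The bound $2^{-n}$ does not depend on $x$, which is exactly what uniform convergence requires.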

We now show that if the sequence of derivatives $(f_n')$ is uniformly convergent, then all
is well. If one adds the hypothesis that the derivatives are continuous, then it is possible to
give a short proof, based on the integral. (See Exercise 11.) However, if the derivatives are
not assumed to be continuous, a somewhat more delicate argument is required.


8.2.3 Theorem Let $J \subseteq \mathbb{R}$ be a bounded interval and let $(f_n)$ be a sequence of functions
on $J$ to $\mathbb{R}$. Suppose that there exists $x_0 \in J$ such that $(f_n(x_0))$ converges, and that the
sequence $(f_n')$ of derivatives exists on $J$ and converges uniformly on $J$ to a function $g$.

Then the sequence $(f_n)$ converges uniformly on $J$ to a function $f$ that has a derivative at
every point of $J$ and $f' = g$.

Proof. Let $a < b$ be the endpoints of $J$ and let $x \in J$ be arbitrary. If $m, n \in \mathbb{N}$, we apply the
Mean Value Theorem 6.2.4 to the difference $f_m - f_n$ on the interval with endpoints $x_0$, $x$.
We conclude that there exists a point $y$ (depending on $m$, $n$) such that

$$f_m(x) - f_n(x) = f_m(x_0) - f_n(x_0) + (x - x_0)\{f_m'(y) - f_n'(y)\}.$$

Hence we have

$$\|f_m - f_n\|_J \le |f_m(x_0) - f_n(x_0)| + (b - a)\|f_m' - f_n'\|_J. \tag{1}$$

By Theorem 8.1.10, it follows from (1), together with the hypotheses that $(f_n(x_0))$ is convergent
and that $(f_n')$ is uniformly convergent on $J$, that $(f_n)$ is uniformly convergent on $J$. We
denote the limit of the sequence $(f_n)$ by $f$. Since the $f_n$ are all continuous and the
convergence is uniform, it follows from Theorem 8.2.2 that $f$ is continuous on $J$.

To establish the existence of the derivative of $f$ at a point $c \in J$, we apply the Mean
Value Theorem 6.2.4 to $f_m - f_n$ on an interval with end points $c$, $x$. We conclude that there
exists a point $z$ (depending on $m$, $n$) such that

$$\{f_m(x) - f_n(x)\} - \{f_m(c) - f_n(c)\} = (x - c)\{f_m'(z) - f_n'(z)\}.$$

Hence, if $x \ne c$, we have

$$\left| \frac{f_m(x) - f_m(c)}{x - c} - \frac{f_n(x) - f_n(c)}{x - c} \right| \le \|f_m' - f_n'\|_J.$$

Since $(f_n')$ converges uniformly on $J$, if $\varepsilon > 0$ is given there exists $H(\varepsilon)$ such that if $m, n \ge
H(\varepsilon)$ and $x \ne c$, then

$$\left| \frac{f_m(x) - f_m(c)}{x - c} - \frac{f_n(x) - f_n(c)}{x - c} \right| \le \varepsilon. \tag{2}$$

If we take the limit in (2) with respect to $m$ and use Theorem 3.2.6, we have

$$\left| \frac{f(x) - f(c)}{x - c} - \frac{f_n(x) - f_n(c)}{x - c} \right| \le \varepsilon,$$

provided that $x \ne c$, $n \ge H(\varepsilon)$. Since $g(c) = \lim(f_n'(c))$, there exists $N(\varepsilon)$ such that if $n \ge
N(\varepsilon)$, then $|f_n'(c) - g(c)| < \varepsilon$. Now let $K := \sup\{H(\varepsilon), N(\varepsilon)\}$. Since $f_K'(c)$ exists, there
exists $\delta_K(\varepsilon) > 0$ such that if $0 < |x - c| < \delta_K(\varepsilon)$, then

$$\left| \frac{f_K(x) - f_K(c)}{x - c} - f_K'(c) \right| < \varepsilon.$$

Combining these inequalities, we conclude that if $0 < |x - c| < \delta_K(\varepsilon)$, then

$$\left| \frac{f(x) - f(c)}{x - c} - g(c) \right| < 3\varepsilon.$$

Since $\varepsilon > 0$ is arbitrary, this shows that $f'(c)$ exists and equals $g(c)$. Since $c \in J$ is arbitrary,
we conclude that $f' = g$ on $J$. Q.E.D.


Interchange of Limit and Integral 


We have seen in Example 8.2.1(c) that if $(f_n)$ is a sequence in $\mathcal{R}[a, b]$ that converges on $[a, b]$
to a function $f$ in $\mathcal{R}[a, b]$, then it need not happen that

$$\int_a^b f = \lim_{n \to \infty} \int_a^b f_n. \tag{3}$$

We will now show that uniform convergence of the sequence is sufficient to guarantee that
this equality holds.


8.2.4 Theorem Let $(f_n)$ be a sequence of functions in $\mathcal{R}[a, b]$ and suppose that $(f_n)$
converges uniformly on $[a, b]$ to $f$. Then $f \in \mathcal{R}[a, b]$ and (3) holds.


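Proof. It follows from the Cauchy Criterion 8.1.10 that given $\varepsilon > 0$ there exists $H(\varepsilon)$ such
that if $m > n \ge H(\varepsilon)$ then

$$-\varepsilon \le f_m(x) - f_n(x) \le \varepsilon \quad\text{for } x \in [a, b].$$

Theorem 7.1.5 implies that

$$-\varepsilon(b - a) \le \int_a^b f_m - \int_a^b f_n \le \varepsilon(b - a).$$

Since $\varepsilon > 0$ is arbitrary, the sequence $\left(\int_a^b f_m\right)$ is a Cauchy sequence in $\mathbb{R}$ and therefore
converges to some number, say $A \in \mathbb{R}$.

We now show $f \in \mathcal{R}[a, b]$ with integral $A$. If $\varepsilon > 0$ is given, let $K(\varepsilon)$ be such that if $m >
K(\varepsilon)$, then $|f_m(x) - f(x)| < \varepsilon$ for all $x \in [a, b]$. If $\dot{\mathcal{P}} := \{([x_{i-1}, x_i], t_i)\}_{i=1}^{n}$ is any tagged
partition of $[a, b]$ and if $m > K(\varepsilon)$, then

$$|S(f_m; \dot{\mathcal{P}}) - S(f; \dot{\mathcal{P}})| = \left| \sum_{i=1}^{n} \{f_m(t_i) - f(t_i)\}(x_i - x_{i-1}) \right|
\le \sum_{i=1}^{n} |f_m(t_i) - f(t_i)|(x_i - x_{i-1}) \le \sum_{i=1}^{n} \varepsilon(x_i - x_{i-1}) = \varepsilon(b - a).$$

We now choose $r > K(\varepsilon)$ such that $\left| \int_a^b f_r - A \right| < \varepsilon$ and we let $\delta_{r,\varepsilon} > 0$ be such that
$\left| \int_a^b f_r - S(f_r; \dot{\mathcal{P}}) \right| < \varepsilon$ whenever $\|\dot{\mathcal{P}}\| < \delta_{r,\varepsilon}$. Then we have

$$|S(f; \dot{\mathcal{P}}) - A| \le |S(f; \dot{\mathcal{P}}) - S(f_r; \dot{\mathcal{P}})| + \left| S(f_r; \dot{\mathcal{P}}) - \int_a^b f_r \right| + \left| \int_a^b f_r - A \right|
\le \varepsilon(b - a) + \varepsilon + \varepsilon = \varepsilon(b - a + 2).$$

But since $\varepsilon > 0$ is arbitrary, it follows that $f \in \mathcal{R}[a, b]$ and $\int_a^b f = A$. Q.E.D.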


The hypothesis of uniform convergence is a very stringent one and restricts the utility 
of this result. In Section 10.4 we will obtain some far-reaching generalizations of Theorem 
8.2.4. For the present, we will state a result that does not require the uniformity of the 
convergence, but does require that the limit function be Riemann integrable. The proof is 
omitted. 


8.2.5 Bounded Convergence Theorem Let $(f_n)$ be a sequence in $\mathcal{R}[a, b]$ that converges
on $[a, b]$ to a function $f \in \mathcal{R}[a, b]$. Suppose also that there exists $B > 0$ such that
$|f_n(x)| \le B$ for all $x \in [a, b]$, $n \in \mathbb{N}$. Then equation (3) holds.


Dini’s Theorem


We will end this section with a famous theorem due to Ulisse Dini (1845-1918) that gives a 
partial converse to Theorem 8.2.2 when the sequence is monotone. We will present a proof 
using nonconstant gauges (see Section 5.5). 


8.2.6 Dini’s Theorem Suppose that $(f_n)$ is a monotone sequence of continuous functions
on $I := [a, b]$ that converges on $I$ to a continuous function $f$. Then the convergence of the
sequence is uniform.


Proof. We suppose that the sequence $(f_n)$ is decreasing and let $g_m := f_m - f$. Then $(g_m)$ is a
decreasing sequence of continuous functions converging on $I$ to the 0-function. We will
show that the convergence is uniform on $I$.

Given $\varepsilon > 0$, $t \in I$, there exists $m_{\varepsilon,t} \in \mathbb{N}$ such that $0 \le g_{m_{\varepsilon,t}}(t) < \varepsilon/2$. Since $g_{m_{\varepsilon,t}}$ is
continuous at $t$, there exists $\delta_\varepsilon(t) > 0$ such that $0 \le g_{m_{\varepsilon,t}}(x) < \varepsilon$ for all $x \in I$ satisfying
$|x - t| \le \delta_\varepsilon(t)$. Thus, $\delta_\varepsilon$ is a gauge on $I$, and if $\dot{\mathcal{P}} = \{(I_i, t_i)\}_{i=1}^{n}$ is a $\delta_\varepsilon$-fine partition, we set
$M_\varepsilon := \max\{m_{\varepsilon,t_1}, \ldots, m_{\varepsilon,t_n}\}$. If $m \ge M_\varepsilon$ and $x \in I$, then (by Lemma 5.5.3) there exists an
index $i$ with $|x - t_i| \le \delta_\varepsilon(t_i)$ and hence

$$0 \le g_m(x) \le g_{m_{\varepsilon,t_i}}(x) < \varepsilon.$$

Therefore, the sequence $(g_m)$ converges uniformly to the 0-function. Q.E.D.

It will be seen in the exercises that we cannot drop any one of the three hypotheses:
(i) the functions $f_n$ are continuous, (ii) the limit function $f$ is continuous, (iii) $I$ is a closed
bounded interval.
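Dini's Theorem can be illustrated with the sequence $G_n(x) = x^n(1 - x)$ of Example 8.1.9(e), which is decreasing in $n$ on $[0, 1]$, consists of continuous functions, and has the continuous limit 0. The sketch below is a numerical check on a grid, not a proof; the names `G`, `sup_formula`, and `GRID` are illustrative, and the closed form used for the norm is the one obtained in that example.

```python
def G(n, x):
    """G_n(x) = x^n (1 - x) on [0, 1]."""
    return x ** n * (1.0 - x)

def sup_formula(n):
    """Uniform norm of G_n on [0, 1] from Example 8.1.9(e): (1 + 1/n)^(-n) / (n + 1)."""
    return (1.0 + 1.0 / n) ** (-n) / (n + 1)

GRID = [i / 2000 for i in range(2001)]   # sample points in [0, 1]

for n in (1, 10, 100):
    decreasing = all(G(n + 1, x) <= G(n, x) for x in GRID)
    grid_max = max(G(n, x) for x in GRID)
    print(n, decreasing, grid_max, sup_formula(n))
```

The sampled maxima stay below the closed-form norms, which tend to 0, so the convergence is uniform as the theorem predicts for this monotone sequence.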


Exercises for Section 8.2 


1. Show that the sequence $(x^n/(1 + x^n))$ does not converge uniformly on $[0, 2]$ by showing that the
limit function is not continuous on $[0, 2]$.


2. Prove that the sequence in Example 8.2.1(c) is an example of a sequence of continuous functions
that converges nonuniformly to a continuous limit.


3. Construct a sequence of functions on $[0, 1]$ each of which is discontinuous at every point of $[0, 1]$
and which converges uniformly to a function that is continuous at every point.


4. Suppose $(f_n)$ is a sequence of continuous functions on an interval $I$ that converges uniformly on $I$
to a function $f$. If $(x_n) \subseteq I$ converges to $x_0 \in I$, show that $\lim(f_n(x_n)) = f(x_0)$.


5. Let $f : \mathbb{R} \to \mathbb{R}$ be uniformly continuous on $\mathbb{R}$ and let $f_n(x) := f(x + 1/n)$ for $x \in \mathbb{R}$. Show that
$(f_n)$ converges uniformly on $\mathbb{R}$ to $f$.

6. Let $f_n(x) := 1/(1 + x)^n$ for $x \in [0, 1]$. Find the pointwise limit $f$ of the sequence $(f_n)$ on $[0, 1]$.
Does $(f_n)$ converge uniformly to $f$ on $[0, 1]$?

7. Suppose the sequence $(f_n)$ converges uniformly to $f$ on the set $A$, and suppose that each $f_n$ is
bounded on $A$. (That is, for each $n$ there is a constant $M_n$ such that $|f_n(x)| \le M_n$ for all $x \in A$.)
Show that the function $f$ is bounded on $A$.


8. Let $f_n(x) := nx/(1 + nx^2)$ for $x \in A := [0, \infty)$. Show that each $f_n$ is bounded on $A$,
but the pointwise limit $f$ of the sequence is not bounded on $A$. Does $(f_n)$ converge uniformly to
$f$ on $A$?


9. Let $f_n(x) := x^n/n$ for $x \in [0, 1]$. Show that the sequence $(f_n)$ of differentiable functions
converges uniformly to a differentiable function $f$ on $[0, 1]$, and that the sequence $(f_n')$ converges
on $[0, 1]$ to a function $g$, but that $g(1) \ne f'(1)$.


10. Let $g_n(x) := e^{-nx}/n$ for $x \ge 0$, $n \in \mathbb{N}$. Examine the relation between $\lim(g_n)$ and $\lim(g_n')$.


11. Let $I := [a, b]$ and let $(f_n)$ be a sequence of functions on $I \to \mathbb{R}$ that converges on $I$ to $f$. Suppose
that each derivative $f_n'$ is continuous on $I$ and that the sequence $(f_n')$ is uniformly convergent to $g$
on $I$. Prove that $f(x) - f(a) = \int_a^x g(t)\,dt$ and that $f'(x) = g(x)$ for all $x \in I$.


12. Show that $\lim \int_1^2 e^{-nx^2}\,dx = 0$.

13. If $a > 0$, show that $\lim \int_a^{\pi} (\sin nx)/(nx)\,dx = 0$. What happens if $a = 0$?

14. Let $f_n(x) := nx/(1 + nx)$ for $x \in [0, 1]$. Show that $(f_n)$ converges nonuniformly to an integrable
function $f$ and that $\int_0^1 f(x)\,dx = \lim \int_0^1 f_n(x)\,dx$.

15. Let $g_n(x) := nx(1 - x)^n$ for $x \in [0, 1]$, $n \in \mathbb{N}$. Discuss the convergence of $(g_n)$ and $\left(\int_0^1 g_n\,dx\right)$.


16. Let $\{r_1, r_2, \ldots, r_n, \ldots\}$ be an enumeration of the rational numbers in $I := [0, 1]$, and let $f_n :
I \to \mathbb{R}$ be defined to be 1 if $x = r_1, \ldots, r_n$ and equal to 0 otherwise. Show that $f_n$ is Riemann
integrable for each $n \in \mathbb{N}$, that $f_1(x) \le f_2(x) \le \cdots \le f_n(x) \le \cdots$, and that $f(x) := \lim(f_n(x))$
is the Dirichlet function, which is not Riemann integrable on $[0, 1]$.


17. Let $f_n(x) := 1$ for $x \in (0, 1/n)$ and $f_n(x) := 0$ elsewhere in $[0, 1]$. Show that $(f_n)$ is a decreasing
sequence of discontinuous functions that converges to a continuous limit function, but the
convergence is not uniform on $[0, 1]$.


18. Let $f_n(x) := x^n$ for $x \in [0, 1]$, $n \in \mathbb{N}$. Show that $(f_n)$ is a decreasing sequence of continuous
functions that converges to a function that is not continuous, but the convergence is not uniform on $[0, 1]$.


19. Let $f_n(x) := x/n$ for $x \in [0, \infty)$, $n \in \mathbb{N}$. Show that $(f_n)$ is a decreasing sequence of continuous
functions that converges to a continuous limit function, but the convergence is not uniform on $[0, \infty)$.


20. Give an example of a decreasing sequence $(f_n)$ of continuous functions on $[0, 1)$ that converges
to a continuous limit function, but the convergence is not uniform on $[0, 1)$.


Section 8.3 The Exponential and Logarithmic Functions 


We will now introduce the exponential and logarithmic functions and will derive some of 
their most important properties. In earlier sections of this book we assumed some 
familiarity with these functions for the purpose of discussing examples. However, it is 
necessary at some point to place these important functions on a firm foundation in order to 
establish their existence and determine their basic properties. We will do that here. There 
are several alternative approaches one can take to accomplish this goal. We will proceed by 
first proving the existence of a function that has itself as derivative. From this basic result, 
we obtain the main properties of the exponential function. The logarithm function is then 
introduced as the inverse of the exponential function, and this inverse relation is used to 
derive the properties of the logarithm function. 


The Exponential Function 


We begin by establishing the key existence result for the exponential function. 


8.3.1 Theorem There exists a function E : R → R such that:


(i) E'(x) = E(x) for all x ∈ R.
(ii) E(0) = 1.

Proof. We inductively define a sequence (E_n) of continuous functions as follows:


(1) E_1(x) := 1 + x,
(2) E_{n+1}(x) := 1 + ∫_0^x E_n(t) dt,


for all n ∈ N, x ∈ R. Clearly E_1 is continuous on R and hence is integrable over any
bounded interval. If E_n has been defined and is continuous on R, then it is integrable over
any bounded interval, so that E_{n+1} is well-defined by the above formula. Moreover, it
follows from the Fundamental Theorem (Second Form) 7.3.5 that E_{n+1} is differentiable at
any point x ∈ R and that


(3) E'_{n+1}(x) = E_n(x) for n ∈ N.

An Induction argument (which we leave to the reader) shows that


(4) E_n(x) = 1 + x/1! + x^2/2! + ··· + x^n/n! for x ∈ R.


Let A > 0 be given; then if |x| ≤ A and m > n > 2A, we have


(5) |E_m(x) − E_n(x)| = |x^{n+1}/(n+1)! + ··· + x^m/m!|
        ≤ A^{n+1}/(n+1)! · [1 + A/n + ··· + (A/n)^{m−n−1}]
        < A^{n+1}/(n+1)! · 2.


Since lim(A^n/n!) = 0, it follows that the sequence (E_n) converges uniformly on the interval
[−A, A] where A > 0 is arbitrary. In particular this means that (E_n(x)) converges for each
x ∈ R. We define E : R → R by


E(x) := lim E_n(x) for x ∈ R.


Since each x ∈ R is contained inside some interval [−A, A], it follows from Theorem 8.2.2
that E is continuous at x. Moreover, it is clear from (1) and (2) that E_n(0) = 1 for all n ∈ N.
Therefore E(0) = 1, which proves (ii).

On any interval [−A, A] we have the uniform convergence of the sequence (E_n). In
view of (3), we also have the uniform convergence of the sequence (E'_n) of derivatives. It
therefore follows from Theorem 8.2.3 that the limit function E is differentiable on [−A, A]
and that


E'(x) = lim(E'_n(x)) = lim(E_{n−1}(x)) = E(x)

for all x ∈ [−A, A]. Since A > 0 is arbitrary, statement (i) is established. Q.E.D.
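The convergence in this proof can be observed numerically. The following Python sketch is an illustration only and is not part of the argument: it evaluates the polynomial in formula (4) and compares it with the library function math.exp, which we assume agrees with E. The helper names picard_iterate and tail_bound are ours. Letting m → ∞ in estimate (5) gives |E(x) − E_n(x)| ≤ 2A^{n+1}/(n+1)! for |x| ≤ A and n > 2A, and the sketch prints both sides of this bound.

```python
from math import exp, factorial


def picard_iterate(n: int, x: float) -> float:
    """Return E_n(x) = 1 + x/1! + ... + x^n/n!, as in formula (4)."""
    term = 1.0
    total = 1.0
    for k in range(1, n + 1):
        term *= x / k  # term now equals x^k / k!
        total += term
    return total


def tail_bound(n: int, A: float) -> float:
    """Return 2*A^(n+1)/(n+1)!, the bound taken from (5); it assumes n > 2A."""
    return 2.0 * A ** (n + 1) / factorial(n + 1)


if __name__ == "__main__":
    A = 3.0
    for n in (7, 10, 15, 20):  # each n satisfies n > 2A
        worst = max(abs(exp(x) - picard_iterate(n, x)) for x in (-A, -1.0, 0.5, A))
        print(n, worst, tail_bound(n, A))
```

The sampled errors should stay below the printed bound, up to floating-point rounding, and both should shrink rapidly as n grows.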


8.3.2 Corollary The function E has a derivative of every order and E^(n)(x) = E(x) for all
n ∈ N, x ∈ R.

Proof. If n = 1, the statement is merely property (i). It follows for arbitrary n ∈ N by
Induction. Q.E.D.


8.3.3 Corollary If x > 0, then 1 + x < E(x).

Proof. It is clear from (4) that if x > 0, then the sequence (E_n(x)) is strictly increasing.
Hence E_1(x) < E(x) for all x > 0. Q.E.D.

It is next shown that the function E, whose existence was established in Theorem 8.3.1,
is unique.


8.3.4 Theorem The function E : R → R that satisfies (i) and (ii) of Theorem 8.3.1 is
unique.


Proof. Let E_1 and E_2 be two functions on R to R that satisfy properties (i) and (ii) of
Theorem 8.3.1 and let F := E_1 − E_2. Then


F'(x) = E_1'(x) − E_2'(x) = E_1(x) − E_2(x) = F(x)

for all x ∈ R and

F(0) = E_1(0) − E_2(0) = 1 − 1 = 0.


It is clear (by Induction) that F has derivatives of all orders and indeed that F^(n)(x) = F(x)
for n ∈ N, x ∈ R.

Let x ∈ R be arbitrary, and let I_x be the closed interval with endpoints 0, x. Since F is
continuous on I_x, there exists K > 0 such that |F(t)| ≤ K for all t ∈ I_x. If we apply Taylor’s
Theorem 6.4.1 to F on the interval I_x and use the fact that F^(k)(0) = F(0) = 0 for all k ∈ N,
it follows that for each n ∈ N there is a point c_n ∈ I_x such that


F(x) = F(0) + F'(0)/1! · x + ··· + F^(n−1)(0)/(n−1)! · x^{n−1} + F^(n)(c_n)/n! · x^n
     = F(c_n)/n! · x^n.

Therefore we have

|F(x)| ≤ K|x|^n/n! for all n ∈ N.


But since lim(|x|^n/n!) = 0, we conclude that F(x) = 0. Since x ∈ R is arbitrary, we infer that
E_1(x) − E_2(x) = F(x) = 0 for all x ∈ R. Q.E.D.


The standard terminology and notation for the function E (which we now know exists 
and is unique) is given in the following definition. 


8.3.5 Definition The unique function E : R → R, such that E'(x) = E(x) for all x ∈ R
and E(0) = 1, is called the exponential function. The number e := E(1) is called Euler’s
number. We will frequently write


exp(x) := E(x) or e^x := E(x) for x ∈ R.


The number e can be obtained as a limit, and thereby approximated, in several different 
ways. [See Exercises 1 and 10, and Example 3.3.6.] 
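As a rough numerical illustration of the two kinds of limit mentioned here, the following Python sketch compares the partial sums E_n(1) = 1 + 1/1! + ··· + 1/n! with the sequence (1 + 1/n)^n. It is a sketch only: the function names are ours, and the library constant math.e is assumed to be an accurate value of e.

```python
from math import e, factorial


def e_from_series(n: int) -> float:
    """Return E_n(1) = 1 + 1/1! + ... + 1/n!."""
    return sum(1.0 / factorial(k) for k in range(n + 1))


def e_from_compounding(n: int) -> float:
    """Return (1 + 1/n)^n."""
    return (1.0 + 1.0 / n) ** n


if __name__ == "__main__":
    for n in (5, 10, 1000):
        print(n, e - e_from_series(n), e - e_from_compounding(n))
```

The series error falls off factorially, whereas (1 + 1/n)^n approaches e much more slowly, so the series is the more practical of the two for computation.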

The use of the notation e^x for E(x) is justified by property (v) in the next theorem,
where it is noted that if r is a rational number, then E(r) and e^r coincide. (Rational
exponents were discussed in Section 5.6.) Thus, the function E can be viewed as extending
the idea of exponentiation from rational numbers to arbitrary real numbers. For a definition
of a^x for a > 0 and arbitrary x ∈ R, see Definition 8.3.10.


8.3.6 Theorem The exponential function satisfies the following properties:
(iii) E(x) ≠ 0 for all x ∈ R;

(iv) E(x + y) = E(x)E(y) for all x, y ∈ R;

(v) E(r) = e^r for all r ∈ Q.


Proof. (iii) Let α ∈ R be such that E(α) = 0, and let J_α be the closed interval with
endpoints 0, α. Let K ≥ |E(t)| for all t ∈ J_α. Taylor’s Theorem 6.4.1 implies that for each
n ∈ N there exists a point c_n ∈ J_α such that


1 = E(0) = E(α) + E'(α)/1! · (−α) + ··· + E^(n−1)(α)/(n−1)! · (−α)^{n−1} + E^(n)(c_n)/n! · (−α)^n
  = E(c_n)/n! · (−α)^n.


Thus we have 0 < 1 ≤ (K/n!)|α|^n for n ∈ N. But since lim(|α|^n/n!) = 0, this is a
contradiction.


(iv) Let y be fixed; by (iii) we have E(y) ≠ 0. Let G : R → R be defined by


G(x) := E(x + y)/E(y) for x ∈ R.

Evidently we have G'(x) = E'(x + y)/E(y) = E(x + y)/E(y) = G(x) for all x ∈ R, and
G(0) = E(0 + y)/E(y) = 1. It follows from the uniqueness of E, proved in Theorem 8.3.4,
that G(x) = E(x) for all x ∈ R. Hence E(x + y) = E(x)E(y) for all x ∈ R. Since y ∈ R
is arbitrary, we obtain (iv).

(v) It follows from (iv) and Induction that if n ∈ N, x ∈ R, then


E(nx) = E(x)^n.


If we let x = 1/n, this relation implies that


e = E(1) = E(n · 1/n) = (E(1/n))^n,

whence it follows that E(1/n) = e^{1/n}. Also we have E(−m) = 1/E(m) = 1/e^m = e^{−m} for
m ∈ N. Therefore, if m ∈ Z, n ∈ N, we have


E(m/n) = (E(1/n))^m = (e^{1/n})^m = e^{m/n}.

This establishes (v). Q.E.D.


8.3.7 Theorem The exponential function E is strictly increasing on R and has range
equal to {y ∈ R : y > 0}. Further, we have
(vi) lim_{x→−∞} E(x) = 0 and lim_{x→∞} E(x) = ∞.

Proof. We know that E(0) = 1 > 0 and E(x) ≠ 0 for all x ∈ R. Since E is continuous on
R, it follows from Bolzano’s Intermediate Value Theorem 5.3.7 that E(x) > 0 for all
x ∈ R. Therefore E'(x) = E(x) > 0 for x ∈ R, so that E is strictly increasing on R.

It follows from Corollary 8.3.3 that 2 < e and that lim_{x→∞} E(x) = ∞. Also, if z > 0, then
since 0 < E(−z) = 1/E(z) it follows that lim_{x→−∞} E(x) = 0. Therefore, by the Intermediate
Value Theorem 5.3.7, every y ∈ R with y > 0 belongs to the range of E. Q.E.D.


The Logarithm Function 


We have seen that the exponential function E is a strictly increasing differentiable function
with domain R and range {y ∈ R : y > 0}. (See Figure 8.3.1.) It follows that E has an
inverse function.


Figure 8.3.1 Graph of E (passing through (0, 1))      Figure 8.3.2 Graph of L (passing through (1, 0))


8.3.8 Definition The function inverse to E : R → R is called the logarithm (or the
natural logarithm). (See Figure 8.3.2.) It will be denoted by L, or by ln.


Since E and L are inverse functions, we have


(L ∘ E)(x) = x for all x ∈ R

and


(E ∘ L)(y) = y for all y ∈ R, y > 0.

These formulas may also be written in the form

ln e^x = x, e^{ln y} = y.


8.3.9 Theorem The logarithm is a strictly increasing function L with domain
{x ∈ R : x > 0} and range R. The derivative of L is given by

(vii) L'(x) = 1/x for x > 0.

The logarithm satisfies the functional equation

(viii) L(xy) = L(x) + L(y) for x > 0, y > 0.

Moreover, we have

(ix) L(1) = 0 and L(e) = 1,

(x) L(x^r) = rL(x) for x > 0, r ∈ Q,

(xi) lim_{x→0+} L(x) = −∞ and lim_{x→∞} L(x) = ∞.

Proof. That L is strictly increasing with domain {x ∈ R : x > 0} and range R follows
from the fact that E is strictly increasing with domain R and range {y ∈ R : y > 0}.


(vii) Since E'(x) = E(x) > 0, it follows from Theorem 6.1.9 that L is differentiable
on (0, ∞) and that


L'(x) = 1/(E' ∘ L)(x) = 1/(E ∘ L)(x) = 1/x for x ∈ (0, ∞).


(viii) If x > 0, y > 0, let u := L(x) and v := L(y). Then we have x = E(u) and y = E(v).
It follows from property (iv) of Theorem 8.3.6 that

xy = E(u)E(v) = E(u + v),

so that L(xy) = (L ∘ E)(u + v) = u + v = L(x) + L(y). This establishes (viii).

The properties in (ix) follow from the relations E(0) = 1 and E(1) = e.

(x) This result follows from (viii) and Mathematical Induction for n ∈ N, and is
extended to r ∈ Q by arguments similar to those in the proof of 8.3.6(v).

To establish property (xi), we first note that since 2 < e, then lim(e^n) = ∞ and
lim(e^{−n}) = 0. Since L(e^n) = n and L(e^{−n}) = −n it follows from the fact that L is strictly
increasing that


lim_{x→∞} L(x) = lim L(e^n) = ∞ and lim_{x→0+} L(x) = lim L(e^{−n}) = −∞. Q.E.D.
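Because L is defined only as the inverse of a strictly increasing function, one way to compute it is to solve E(x) = y by bisection. The Python sketch below is an illustration under stated assumptions: math.exp stands in for E, the name log_by_bisection is ours, and y is assumed to be of moderate size so that the bracketing step does not overflow. The library function math.log is imported only for comparison.

```python
from math import exp, log


def log_by_bisection(y: float, tol: float = 1e-12) -> float:
    """Approximate L(y) by solving exp(x) = y, using only that exp is increasing."""
    if y <= 0.0:
        raise ValueError("the logarithm is defined only for y > 0")
    lo, hi = -1.0, 1.0
    while exp(lo) > y:  # widen the bracket until exp(lo) <= y
        lo *= 2.0
    while exp(hi) < y:  # widen the bracket until exp(hi) >= y
        hi *= 2.0
    while hi - lo > tol:  # halve the bracket, keeping exp(lo) <= y <= exp(hi)
        mid = 0.5 * (lo + hi)
        if exp(mid) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    print(log_by_bisection(2.0), log(2.0))
    # Functional equation (viii): L(6) should be close to L(2) + L(3).
    print(log_by_bisection(6.0), log_by_bisection(2.0) + log_by_bisection(3.0))
```

Bisection is chosen here because it uses nothing beyond the monotonicity and range established for E; a faster method would need the derivative formula (vii).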


Power Functions 


In Definition 5.6.6, we discussed the power function x ↦ x^r, x > 0, where r is a rational
number. By using the exponential and logarithm functions, we can extend the notion of
power functions from rational to arbitrary real powers.


8.3.10 Definition If α ∈ R and x > 0, the number x^α is defined to be

x^α := e^{α ln x} = E(αL(x)).

The function x ↦ x^α for x > 0 is called the power function with exponent α.


Note If x > 0 and α = m/n where m ∈ Z, n ∈ N, then we defined x^α := (x^m)^{1/n} in
Section 5.6. Hence we have ln x^α = α ln x, whence x^α = e^{ln x^α} = e^{α ln x}. Hence Definition
8.3.10 is consistent with the definition given in Section 5.6.


We now state some properties of the power functions. Their proofs are immediate 
consequences of the properties of the exponential and logarithm functions and will be left 
to the reader. 


8.3.11 Theorem If α ∈ R and x, y belong to (0, ∞), then:
(a) 1^α = 1, (b) x^α > 0,

(c) (xy)^α = x^α y^α, (d) (x/y)^α = x^α/y^α.

8.3.12 Theorem If α, β ∈ R and x ∈ (0, ∞), then:


(a) x^{α+β} = x^α x^β, (b) (x^α)^β = x^{αβ} = (x^β)^α,
(c) x^{−α} = 1/x^α, (d) if α < β, then x^α < x^β for x > 1.


The next result concerns the differentiability of the power functions.

8.3.13 Theorem Let α ∈ R. Then the function x ↦ x^α on (0, ∞) to R is continuous and
differentiable, and


Dx^α = αx^{α−1} for x ∈ (0, ∞).


Proof. By the Chain Rule we have


Dx^α = De^{α ln x} = e^{α ln x} · D(α ln x)
     = x^α · α/x = αx^{α−1} for x ∈ (0, ∞). Q.E.D.


It will be seen in an exercise that if α > 0, the power function x ↦ x^α is strictly increasing
on (0, ∞) to R, and that if α < 0, the function x ↦ x^α is strictly decreasing. (What happens
if α = 0?)

The graphs of the functions x ↦ x^α on (0, ∞) to R are similar to those in Figure 5.6.8.
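Definition 8.3.10 and the derivative formula of Theorem 8.3.13 can be spot-checked numerically. The following Python sketch is illustrative only; it assumes that math.exp and math.log agree with E and L, and the names power, power_derivative, and central_difference are ours.

```python
from math import exp, log


def power(x: float, alpha: float) -> float:
    """Return x^alpha := exp(alpha * ln x) for x > 0, as in Definition 8.3.10."""
    if x <= 0.0:
        raise ValueError("x must be positive")
    return exp(alpha * log(x))


def power_derivative(x: float, alpha: float) -> float:
    """Return alpha * x^(alpha - 1), the derivative given by Theorem 8.3.13."""
    return alpha * power(x, alpha - 1.0)


def central_difference(f, x: float, h: float = 1e-6) -> float:
    """Symmetric difference quotient, used only as a numerical check."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


if __name__ == "__main__":
    alpha, x = 2.5, 1.7
    print(central_difference(lambda t: power(t, alpha), x), power_derivative(x, alpha))
```

The two printed values should agree to several decimal places; the difference quotient is only an approximation, so exact agreement is not expected.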


The Function log_a


If a > 0, a ≠ 1, it is sometimes useful to define the function log_a.

8.3.14 Definition Let a > 0, a ≠ 1. We define


log_a(x) := ln x / ln a for x ∈ (0, ∞).

For x ∈ (0, ∞), the number log_a(x) is called the logarithm of x to the base a. The case a =
e yields the logarithm (or natural logarithm) function of Definition 8.3.8. The case a = 10
gives the base 10 logarithm (or common logarithm) function log_10 often used in
computations. Properties of the functions log_a will be given in the exercises.


Exercises for Section 8.3 


1. Show that if x > 0 and if n > 2x, then


|e^x − (1 + x/1! + ··· + x^n/n!)| < 2x^{n+1}/(n+1)!.


Use this formula to show that 2 2/3 < e < 2 3/4, hence e is not an integer.


2. Calculate e correct to five decimal places. 


3. Show that if 0 ≤ x ≤ a and n ∈ N, then


1 + x/1! + ··· + x^n/n! ≤ e^x ≤ 1 + x/1! + ··· + x^{n−1}/(n−1)! + e^a x^n/n!.


4. Show that if n ≥ 2, then


0 < e·n! − (1 + 1 + 1/2! + ··· + 1/n!)·n! < e/(n + 1) < 1.


Use this inequality to prove that e is not a rational number.


5. If x ≥ 0 and n ∈ N, show that


1/(x + 1) = 1 − x + x^2 − x^3 + ··· + (−x)^{n−1} + (−x)^n/(1 + x).


Use this to show that


ln(x + 1) = x − x^2/2 + x^3/3 − ··· + (−1)^{n−1} x^n/n + ∫_0^x (−t)^n/(1 + t) dt

and that


|ln(x + 1) − (x − x^2/2 + x^3/3 − ··· + (−1)^{n−1} x^n/n)| ≤ x^{n+1}/(n + 1).


6. Use the formula in the preceding exercise to calculate ln 1.1 and ln 1.4 accurate to four decimal
places. How large must one choose n in this inequality to calculate ln 2 accurate to four decimal
places?


7. Show that ln(e/2) = 1 − ln 2. Use this result to calculate ln 2 accurate to four decimal places.


8. Let f : R → R be such that f'(x) = f(x) for all x ∈ R. Show that there exists K ∈ R such that
f(x) = Ke^x for all x ∈ R.


9. Let a_k > 0 for k = 1, ..., n and let A := (a_1 + ··· + a_n)/n be the arithmetic mean of these
numbers. For each k, put x_k := a_k/A − 1 in the inequality 1 + x ≤ e^x. Multiply the resulting
terms to prove the Arithmetic-Geometric Mean Inequality


(6) (a_1 ··· a_n)^{1/n} ≤ (1/n)(a_1 + ··· + a_n).

Moreover, show that equality holds in (6) if and only if a_1 = a_2 = ··· = a_n.
10. Evaluate L'(1) by using the sequence (1 + 1/n) and the fact that e = lim((1 + 1/n)^n).
11. Establish the assertions in Theorem 8.3.11.
12. Establish the assertions in Theorem 8.3.12.


13. (a) Show that if α > 0, then the function x ↦ x^α is strictly increasing on (0, ∞) to R and that
lim_{x→0+} x^α = 0 and lim_{x→∞} x^α = ∞.

(b) Show that if α < 0, then the function x ↦ x^α is strictly decreasing on (0, ∞) to R and that
lim_{x→0+} x^α = ∞ and lim_{x→∞} x^α = 0.


14. Prove that if a > 0, a ≠ 1, then a^{log_a x} = x for all x ∈ (0, ∞) and log_a(a^y) = y for all y ∈ R.
Therefore the function x ↦ log_a x on (0, ∞) to R is inverse to the function y ↦ a^y on R.


15. If a > 0, a ≠ 1, show that the function x ↦ log_a x is differentiable on (0, ∞) and that D log_a x =
1/(x ln a) for x ∈ (0, ∞).


16. If a > 0, a ≠ 1, and x and y belong to (0, ∞), prove that log_a(xy) = log_a x + log_a y.
17. If a > 0, a ≠ 1, and b > 0, b ≠ 1, show that


log_a x = (ln b / ln a) log_b x for x ∈ (0, ∞).


In particular, show that log_10 x = (ln e/ln 10) ln x = (log_10 e) ln x for x ∈ (0, ∞).


Section 8.4 The Trigonometric Functions 


Along with the exponential and logarithmic functions, there is another very important 
collection of transcendental functions known as the “trigonometric functions.” These are 
the sine, cosine, tangent, cotangent, secant, and cosecant functions. In elementary courses, 
they are usually introduced on a geometric basis in terms of either triangles or the unit 
circle. In this section, we introduce the trigonometric functions in an analytical manner and 
then establish some of their basic properties. In particular, the various properties of the 
trigonometric functions that were used in examples in earlier parts of this book will be 
derived rigorously in this section. 

It suffices to deal with the sine and cosine since the other four trigonometric functions
are defined in terms of these two. Our approach to the sine and cosine is similar in spirit to
our approach to the exponential function in that we first establish the existence of functions
that satisfy certain differentiation properties.


8.4.1 Theorem There exist functions C : R → R and S : R → R such that


(i) C''(x) = −C(x) and S''(x) = −S(x) for all x ∈ R.
(ii) C(0) = 1, C'(0) = 0, and S(0) = 0, S'(0) = 1.


Proof. We define the sequences (C_n) and (S_n) of continuous functions inductively as
follows:


(1) C_1(x) := 1, S_1(x) := x,
(2) S_n(x) := ∫_0^x C_n(t) dt,
(3) C_{n+1}(x) := 1 − ∫_0^x S_n(t) dt,


for all n ∈ N, x ∈ R.

One sees by Induction that the functions C_n and S_n are continuous on R and hence they
are integrable over any bounded interval; thus these functions are well-defined by the above
formulas. Moreover, it follows from the Fundamental Theorem 7.3.5 that S_n and C_{n+1} are
differentiable at every point and that


(4) S_n'(x) = C_n(x) and C_{n+1}'(x) = −S_n(x) for n ∈ N, x ∈ R.


Induction arguments (which we leave to the reader) show that


C_{n+1}(x) = 1 − x^2/2! + x^4/4! − ··· + (−1)^n x^{2n}/(2n)!,
S_{n+1}(x) = x − x^3/3! + x^5/5! − ··· + (−1)^n x^{2n+1}/(2n+1)!.
Let A > 0 be given. Then if |x| ≤ A and m > n > 2A, we have that (since A/2n < 1/4):

(5) |C_m(x) − C_n(x)| = |x^{2n}/(2n)! − x^{2n+2}/(2n+2)! + ··· ± x^{2m−2}/(2m−2)!|
        ≤ A^{2n}/(2n)! · [1 + (A/2n)^2 + ··· + (A/2n)^{2m−2n−2}]
        < A^{2n}/(2n)! · (16/15).


Since lim(A^{2n}/(2n)!) = 0, the sequence (C_n) converges uniformly on the interval [−A, A],
where A > 0 is arbitrary. In particular, this means that (C_n(x)) converges for each x ∈ R.
We define C : R → R by

C(x) := lim C_n(x) for x ∈ R.


It follows from Theorem 8.2.2 that C is continuous on R and, since C_n(0) = 1 for all n ∈ N,
that C(0) = 1.
If |x| ≤ A and m ≥ n > 2A, it follows from (2) that


S_m(x) − S_n(x) = ∫_0^x {C_m(t) − C_n(t)} dt.

If we use (5) and Corollary 7.3.15, we conclude that


|S_m(x) − S_n(x)| ≤ A^{2n}/(2n)! · (16/15 · A),


whence the sequence (S_n) converges uniformly on [−A, A]. We define S : R → R by

S(x) := lim S_n(x) for x ∈ R.


It follows from Theorem 8.2.2 that S is continuous on R and, since S_n(0) = 0 for all n ∈ N,
that S(0) = 0.

Since C_n'(x) = −S_{n−1}(x) for n > 1, it follows from the above that the sequence
(C_n') converges uniformly on [−A, A]. Hence by Theorem 8.2.3, the limit function C is
differentiable on [−A, A] and


C'(x) = lim C_n'(x) = lim(−S_{n−1}(x)) = −S(x) for x ∈ [−A, A].

Since A > 0 is arbitrary, we have

(6) C'(x) = −S(x) for x ∈ R.


A similar argument, based on the fact that S_n'(x) = C_n(x), shows that S is differentiable on
R and that


(7) S'(x) = C(x) for all x ∈ R.

It follows from (6) and (7) that

C''(x) = −(S(x))' = −C(x) and S''(x) = (C(x))' = −S(x)

for all x ∈ R. Moreover, we have

C'(0) = −S(0) = 0, S'(0) = C(0) = 1.


Thus statements (i) and (ii) are proved. Q.E.D. 
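The polynomials produced by the induction in this proof are easy to evaluate, and their convergence can be observed numerically. The Python sketch below is illustrative only; the names cos_partial and sin_partial are ours, and the library functions math.cos and math.sin are assumed to agree with the limit functions C and S.

```python
from math import cos, factorial, sin


def cos_partial(n: int, x: float) -> float:
    """Return C_{n+1}(x) = 1 - x^2/2! + ... + (-1)^n x^(2n)/(2n)!."""
    return sum((-1) ** k * x ** (2 * k) / factorial(2 * k) for k in range(n + 1))


def sin_partial(n: int, x: float) -> float:
    """Return S_{n+1}(x) = x - x^3/3! + ... + (-1)^n x^(2n+1)/(2n+1)!."""
    return sum((-1) ** k * x ** (2 * k + 1) / factorial(2 * k + 1) for k in range(n + 1))


if __name__ == "__main__":
    for x in (-3.0, 0.5, 3.0):
        c, s = cos_partial(15, x), sin_partial(15, x)
        # Compare with the library values and check that c^2 + s^2 is close to 1.
        print(x, c - cos(x), s - sin(x), c * c + s * s)
```

Letting m → ∞ in estimate (5) suggests that |C(x) − C_n(x)| ≤ (16/15)·A^{2n}/(2n)! when |x| ≤ A and n > 2A; with A = 1.5 and n = 4 this corresponds to cos_partial(3, 1.5).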


8.4.2 Corollary If C, S are the functions in Theorem 8.4.1, then
(iii) C'(x) = −S(x) and S'(x) = C(x) for x ∈ R.
Moreover, these functions have derivatives of all orders.

Proof. The formulas (iii) were established in (6) and (7). The existence of the higher order
derivatives follows by Induction. Q.E.D.

8.4.3 Corollary The functions C and S satisfy the Pythagorean Identity:
(iv) (C(x))^2 + (S(x))^2 = 1 for x ∈ R.

Proof. Let f(x) := (C(x))^2 + (S(x))^2 for x ∈ R, so that

f'(x) = 2C(x)(−S(x)) + 2S(x)(C(x)) = 0 for x ∈ R.


Thus it follows that f(x) is a constant for all x ∈ R. But since f(0) = 1 + 0 = 1, we
conclude that f(x) = 1 for all x ∈ R. Q.E.D.


We next establish the uniqueness of the functions C and S. 


8.4.4 Theorem The functions C and S satisfying properties (i) and (ii) of Theorem 8.4.1
are unique.

Proof. Let C_1 and C_2 be two functions on R to R that satisfy C_j''(x) = −C_j(x) for all
x ∈ R and C_j(0) = 1, C_j'(0) = 0 for j = 1, 2. If we let D := C_1 − C_2, then D''(x) = −D(x)
for x ∈ R and D(0) = 0 and D^(k)(0) = 0 for all k ∈ N.

Now let x ∈ R be arbitrary, and let I_x be the interval with endpoints 0, x. Since D =
C_1 − C_2 and T := S_1 − S_2 = C_2' − C_1' are continuous on I_x, there exists K > 0 such that
|D(t)| ≤ K and |T(t)| ≤ K for all t ∈ I_x. If we apply Taylor’s Theorem 6.4.1 to D on I_x and
use the fact that D(0) = 0, D^(k)(0) = 0 for k ∈ N, it follows that for each n ∈ N there is a
point c_n ∈ I_x such that


D(x) = D(0) + D'(0)/1! · x + ··· + D^(n−1)(0)/(n−1)! · x^{n−1} + D^(n)(c_n)/n! · x^n
     = D^(n)(c_n)/n! · x^n.


Now either D^(n)(c_n) = ±D(c_n) or D^(n)(c_n) = ±T(c_n). In either case we have


|D(x)| ≤ K|x|^n/n!.


But since lim(|x|^n/n!) = 0, we conclude that D(x) = 0. Since x ∈ R is arbitrary, we infer
that C_1(x) − C_2(x) = 0 for all x ∈ R.

A similar argument shows that if S_1 and S_2 are two functions on R → R such that
S_j''(x) = −S_j(x) for all x ∈ R and S_j(0) = 0, S_j'(0) = 1 for j = 1, 2, then we have S_1(x) =
S_2(x) for all x ∈ R. Q.E.D.


Now that existence and uniqueness of the functions C and S have been established, we
shall give these functions their familiar names.


8.4.5 Definition The unique functions C : R → R and S : R → R such that C''(x) =
−C(x) and S''(x) = −S(x) for all x ∈ R and C(0) = 1, C'(0) = 0, and S(0) = 0,
S'(0) = 1, are called the cosine function and the sine function, respectively. We ordinarily
write


cos x := C(x) and sin x := S(x) for x ∈ R.


The differentiation properties in (i) of Theorem 8.4.1 do not by themselves lead to 
uniquely determined functions. We have the following relationship. 


8.4.6 Theorem If f : R → R is such that

f''(x) = −f(x) for x ∈ R,

then there exist real numbers α, β such that


f(x) = αC(x) + βS(x) for x ∈ R.

Proof. Let g(x) := f(0)C(x) + f'(0)S(x) for x ∈ R. It is readily seen that g''(x) =
−g(x) and that g(0) = f(0), and since

g'(x) = −f(0)S(x) + f'(0)C(x),


that g'(0) = f'(0). Therefore the function h := f − g is such that h''(x) = −h(x) for all
x ∈ R and h(0) = 0, h'(0) = 0. Thus it follows from the proof of the preceding theorem
that h(x) = 0 for all x ∈ R. Therefore f(x) = g(x) for all x ∈ R. Q.E.D.

We shall now derive a few of the basic properties of the cosine and sine functions.


8.4.7 Theorem The function C is even and S is odd in the sense that

(v) C(−x) = C(x) and S(−x) = −S(x) for x ∈ R.

If x, y ∈ R, then we have the “addition formulas”

(vi) C(x + y) = C(x)C(y) − S(x)S(y), S(x + y) = S(x)C(y) + C(x)S(y).

Proof. (v) If φ(x) := C(−x) for x ∈ R, then a calculation shows that φ''(x) = −φ(x) for
x ∈ R. Moreover, φ(0) = 1 and φ'(0) = 0 so that φ = C. Hence, C(−x) = C(x) for all
x ∈ R. In a similar way one shows that S(−x) = −S(x) for all x ∈ R.


(vi) Let y ∈ R be given and let f(x) := C(x + y) for x ∈ R. A calculation shows that
f''(x) = −f(x) for x ∈ R. Hence, by Theorem 8.4.6, there exist real numbers α, β such
that


f(x) = C(x + y) = αC(x) + βS(x) and
f'(x) = −S(x + y) = −αS(x) + βC(x)


for x ∈ R. If we let x = 0, we obtain C(y) = α and −S(y) = β, whence the first formula in
(vi) follows. The second formula is proved similarly. Q.E.D.


The following inequalities were used earlier (for example, in 4.2.8).

8.4.8 Theorem If x ∈ R, x ≥ 0, then we have
(vii) −x ≤ S(x) ≤ x; (viii) 1 − (1/2)x^2 ≤ C(x) ≤ 1;
(ix) x − (1/6)x^3 ≤ S(x) ≤ x; (x) 1 − (1/2)x^2 ≤ C(x) ≤ 1 − (1/2)x^2 + (1/24)x^4.


Proof. Corollary 8.4.3 implies that −1 ≤ C(t) ≤ 1 for t ∈ R, so that if x ≥ 0, then


−x ≤ ∫_0^x C(t) dt ≤ x,

whence we have (vii). If we integrate (vii), we obtain

−(1/2)x^2 ≤ ∫_0^x S(t) dt ≤ (1/2)x^2,

whence we have

−(1/2)x^2 ≤ −C(x) + 1 ≤ (1/2)x^2.


Thus we have 1 − (1/2)x^2 ≤ C(x), which implies (viii).
Inequality (ix) follows by integrating (viii), and (x) follows by integrating (ix). Q.E.D.
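The four inequalities can be spot-checked on a grid of points. The Python sketch below is a numerical illustration, not a proof; it assumes that math.sin and math.cos agree with S and C, and it allows a small rounding slack because the bounds are attained with equality at x = 0. The name check_bounds is ours.

```python
from math import cos, sin


def check_bounds(x: float, slack: float = 1e-12) -> bool:
    """Return True if (vii)-(x) of Theorem 8.4.8 hold at x, up to rounding slack."""
    s, c = sin(x), cos(x)
    ineq_vii = -x - slack <= s <= x + slack
    ineq_viii = 1 - x**2 / 2 - slack <= c <= 1 + slack
    ineq_ix = x - x**3 / 6 - slack <= s <= x + slack
    ineq_x = 1 - x**2 / 2 - slack <= c <= 1 - x**2 / 2 + x**4 / 24 + slack
    return ineq_vii and ineq_viii and ineq_ix and ineq_x


if __name__ == "__main__":
    print(all(check_bounds(k / 10) for k in range(101)))  # grid on [0, 10]
    print(check_bounds(-1.0))  # the hypothesis x >= 0 matters for (vii)
```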


The number π is obtained via the following lemma.


8.4.9 Lemma There exists a root γ of the cosine function in the interval (√2, √3).
Moreover C(x) > 0 for x ∈ [0, γ). The number 2γ is the smallest positive root of S.


Proof. Inequality (x) of Theorem 8.4.8 implies that C has a root between the positive root
√2 of x^2 − 2 = 0 and the smallest positive root of x^4 − 12x^2 + 24 = 0, which is
√(6 − 2√3) < √3. We let γ be the smallest such root of C.

It follows from the second formula in (vi) with x = y that S(2x) = 2S(x)C(x). This
relation implies that S(2γ) = 0, so that 2γ is a positive root of S. The same relation implies
that if 2δ > 0 is the smallest positive root of S, then C(δ) = 0. Since γ is the smallest
positive root of C, we have δ = γ. Q.E.D.


8.4.10 Definition Let π := 2γ denote the smallest positive root of S.

Note The inequality √2 < γ < √(6 − 2√3) implies that 2.828 < π < 3.185.
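The bracket for γ in this Note can be narrowed by bisection, which gives a numerical value for π = 2γ. The Python sketch below is an illustration under the assumption that math.cos agrees with C; the name smallest_cosine_root is ours, and math.pi is imported only for comparison.

```python
from math import cos, pi, sqrt


def smallest_cosine_root(tol: float = 1e-12) -> float:
    """Bisect on [sqrt(2), sqrt(6 - 2*sqrt(3))], where the cosine changes sign."""
    lo = sqrt(2.0)                      # cos(lo) > 0
    hi = sqrt(6.0 - 2.0 * sqrt(3.0))    # cos(hi) < 0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cos(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    gamma = smallest_cosine_root()
    print(2.0 * gamma, pi)
```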


8.4.11 Theorem The functions C and S have period 2π in the sense that
(xi) C(x + 2π) = C(x) and S(x + 2π) = S(x) for x ∈ R.
Moreover we have


(xii) S(x) = C(π/2 − x) = −C(x + π/2), C(x) = S(π/2 − x) = S(x + π/2) for all
x ∈ R.


Proof. (xi) Since S(2x) = 2S(x)C(x) and S(π) = 0, then S(2π) = 0. Further, if x = y
in (vi), we obtain C(2x) = (C(x))^2 − (S(x))^2. Therefore C(2π) = 1. Hence (vi) with
y = 2π gives

C(x + 2π) = C(x)C(2π) − S(x)S(2π) = C(x),


and


S(x + 2π) = S(x)C(2π) + C(x)S(2π) = S(x).


(xii) We note that C(π/2) = 0, and it is an exercise to show that S(π/2) = 1. If we
employ these together with formulas (vi), the desired relations are obtained. Q.E.D.


Exercises for Section 8.4 


1. Calculate cos(.2), sin(.2) and cos 1, sin 1 correct to four decimal places. 
2. Show that |sin x| ≤ 1 and |cos x| ≤ 1 for all x ∈ R.


3. Show that property (vii) of Theorem 8.4.8 does not hold if x < 0, but that we have |sin x| ≤ |x|
for all x ∈ R. Also show that |sin x − x| ≤ |x|^3/6 for all x ∈ R.
4. Show that if x > 0 then


1 − x^2/2 + x^4/24 − x^6/720 ≤ cos x ≤ 1 − x^2/2 + x^4/24.


Use this inequality to establish a lower bound for π.


5. Calculate π by approximating the smallest positive zero of sin. (Either bisect intervals or use
Newton’s Method of Section 6.4.)


6. Define the sequence (c_n) and (s_n) inductively by c_1(x) := 1, s_1(x) := x, and


s_n(x) := ∫_0^x c_n(t) dt, c_{n+1}(x) := 1 + ∫_0^x s_n(t) dt


for all n ∈ N, x ∈ R. Reason as in the proof of Theorem 8.4.1 to conclude that there exist
functions c : R → R and s : R → R such that (j) c''(x) = c(x) and s''(x) = s(x) for all x ∈ R,
and (jj) c(0) = 1, c'(0) = 0 and s(0) = 0, s'(0) = 1. Moreover, c'(x) = s(x) and s'(x) = c(x)
for all x ∈ R.

7. Show that the functions c, s in the preceding exercise have derivatives of all orders, and that they
satisfy the identity (c(x))^2 − (s(x))^2 = 1 for all x ∈ R. Moreover, they are the unique functions
satisfying (j) and (jj). (The functions c, s are called the hyperbolic cosine and hyperbolic sine
functions, respectively.)


8. If f : R → R is such that f''(x) = f(x) for all x ∈ R, show that there exist real numbers α, β
such that f(x) = αc(x) + βs(x) for all x ∈ R. Apply this to the functions f_1(x) := e^x and
f_2(x) := e^{−x} for x ∈ R. Show that c(x) = (1/2)(e^x + e^{−x}) and s(x) = (1/2)(e^x − e^{−x}) for x ∈ R.


9. Show that the functions c, s in the preceding exercises are even and odd, respectively, and that
c(x + y) = c(x)c(y) + s(x)s(y), s(x + y) = s(x)c(y) + c(x)s(y),
for all x, y ∈ R.


10. Show that c(x) ≥ 1 for all x ∈ R, that both c and s are strictly increasing on (0, ∞), and that
lim_{x→∞} c(x) = lim_{x→∞} s(x) = ∞.


CHAPTER 9 


INFINITE SERIES 


In Section 3.7 we gave a brief introduction to the theory of infinite series. The reader will do 
well to look over that section at this time, since we will not repeat the definitions and results 
given there. 

Instead, in Section 9.1 we will introduce the important notion of the “absolute
convergence” of a series. In Section 9.2 we will present some “tests” for absolute
convergence that will probably be familiar to the reader from calculus. The third section
gives a discussion of series that are not absolutely convergent. In the final section we study 
series of functions and will establish the basic properties of power series, which are very 
important in applications. 


Section 9.1 Absolute Convergence 


We have already met (in Section 3.7) a number of infinite series that are convergent and
others that are divergent. For example, in Example 3.7.6(b) we saw that the harmonic
series:

Σ_{n=1}^∞ 1/n

is divergent since its sequence of partial sums s_n := 1/1 + 1/2 + ··· + 1/n (n ∈ N) is unbounded.
On the other hand, we saw in Example 3.7.6(f) that the alternating harmonic series:

Σ_{n=1}^∞ (−1)^{n+1}/n

is convergent because of the subtraction that takes place. Since

|(−1)^{n+1}/n| = 1/n,

these two series illustrate the fact that a series Σ x_n may be convergent, but the series Σ |x_n|
obtained by taking the absolute values of the terms may be divergent. This observation leads
us to an important definition.


9.1.1 Definition Let X := (x_n) be a sequence in R. We say that the series Σ x_n is absolutely
convergent if the series Σ |x_n| is convergent in R. A series is said to be conditionally (or
nonabsolutely) convergent if it is convergent, but it is not absolutely convergent.


It is trivial that a series of positive terms is absolutely convergent if and only if it is
convergent. We have noted above that the alternating harmonic series is conditionally
convergent.

9.1.2 Theorem If a series in R is absolutely convergent, then it is convergent.

Proof. Since Σ |x_n| is convergent, the Cauchy Criterion 3.7.4 implies that, given ε > 0
there exists M(ε) ∈ N such that if m > n ≥ M(ε), then

|x_{n+1}| + |x_{n+2}| + ··· + |x_m| < ε.


However, by the Triangle Inequality, the left side of this expression dominates


|s_m − s_n| = |x_{n+1} + x_{n+2} + ··· + x_m|.


Since ε > 0 is arbitrary, Cauchy’s Criterion implies that Σ x_n converges. Q.E.D.


Grouping of Series 


Given a series Σ x_n, we can construct many other series Σ y_k by leaving the order of the
terms x_n fixed, but inserting parentheses that group together finite numbers of terms. For
example, the series indicated by


1 − 1/2 + (1/3 − 1/4) + (1/5 − 1/6 + 1/7) − 1/8 + (1/9 − ··· + 1/13) − ···


is obtained by grouping the terms in the alternating harmonic series. It is an interesting fact
that such grouping does not affect the convergence or the value of a convergent series.


9.1.3 Theorem If a series Σ x_n is convergent, then any series obtained from it by
grouping the terms is also convergent and to the same value.


Proof. Suppose that we have

y_1 := x_1 + ··· + x_{k_1}, y_2 := x_{k_1+1} + ··· + x_{k_2}, ....

If s_n denotes the nth partial sum of Σ x_n and t_k denotes the kth partial sum of Σ y_k, then we have

t_1 = y_1 = s_{k_1}, t_2 = y_1 + y_2 = s_{k_2}, ....


Thus, the sequence (t_k) of partial sums of the grouped series Σ y_k is a subsequence of the
sequence (s_n) of partial sums of Σ x_n. Since this latter series was assumed to be convergent,
so is the grouped series Σ y_k. Q.E.D.


It is clear that the converse to this theorem is not true. Indeed, the grouping

(1 − 1) + (1 − 1) + (1 − 1) + ···


produces a convergent series from Σ_{n=0}^∞ (−1)^n, which was seen to be divergent in Example
3.7.2(b) since the terms do not approach 0.
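The relationship between a series and its groupings can be made concrete with finite lists. In the following Python sketch (an illustration only; the helper names partial_sums and group are ours), the partial sums of the grouped terms are exactly the partial sums of the original terms taken at the ends of the groups, which is the subsequence used in the proof of Theorem 9.1.3. For the terms (−1)^n the full sequence of partial sums oscillates, while the grouped partial sums are constant.

```python
from itertools import accumulate


def partial_sums(terms):
    """Return the list of partial sums of a finite list of terms."""
    return list(accumulate(terms))


def group(terms, sizes):
    """Sum consecutive blocks of the given sizes, keeping the order of the terms."""
    grouped, start = [], 0
    for size in sizes:
        grouped.append(sum(terms[start:start + size]))
        start += size
    return grouped


if __name__ == "__main__":
    terms = [(-1) ** n for n in range(10)]        # 1, -1, 1, -1, ...
    print(partial_sums(terms))                    # oscillates between 1 and 0
    print(partial_sums(group(terms, [2] * 5)))    # the grouped series (1 - 1) + (1 - 1) + ...
```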


Rearrangements of Series 


Loosely speaking, a “rearrangement” of a series is another series that is obtained from the
given one by using all of the terms exactly once, but scrambling the order in which the
terms are taken. For example, the harmonic series has rearrangements


1/2 + 1/1 + 1/4 + 1/3 + ··· + 1/(2n) + 1/(2n − 1) + ···,
1/1 + 1/2 + 1/4 + 1/3 + 1/5 + 1/7 + ···.

The first rearrangement is obtained from the harmonic series by interchanging the first and
second terms, the third and fourth terms, and so forth. The second rearrangement is
obtained from the harmonic series by taking one “odd term,” two “even terms,” three
“odd terms,” and so forth. It is obvious that there are infinitely many other possible
rearrangements of the harmonic series.


9.1.4 Definition A series Σ y_k in R is a rearrangement of a series Σ x_n if there is a
bijection f of N onto N such that y_k = x_{f(k)} for all k ∈ N.


While grouping the terms does not affect the convergence of a series, making rearrangements
may do so. In fact, there is a remarkable observation, due to Riemann, that if Σ x_n is a
conditionally convergent series in R, and if c ∈ R is arbitrary, then there is a rearrangement
of Σ x_n that converges to c.

To prove this assertion, we first note that a conditionally convergent series must contain 
infinitely many positive terms and infinitely many negative terms (see Exercise 1), and that 
both the series of positive terms and the series of negative terms diverge (see Exercise 2). To 
construct a series converging to c, we take positive terms until the partial sum is greater than c, 
then we take negative terms until the partial sum is less than c, then we take positive terms 
until the partial sum is greater than c, then we take negative terms, etc. 
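The construction just described can be carried out mechanically for the alternating harmonic series. The Python sketch below is an illustration with names of our own choosing; it uses a greedy form of the rule (add the next unused positive term while the partial sum is at most c, otherwise add the next unused negative term), which differs from the description above only when a partial sum equals c exactly.

```python
def rearranged_partial_sums(c: float, n_terms: int) -> list:
    """Partial sums of a rearrangement of sum (-1)^(n+1)/n steered toward c."""
    next_odd = 1    # denominator of the next unused positive term 1/1, 1/3, 1/5, ...
    next_even = 2   # denominator of the next unused negative term -1/2, -1/4, ...
    total = 0.0
    sums = []
    for _ in range(n_terms):
        if total <= c:
            total += 1.0 / next_odd
            next_odd += 2
        else:
            total -= 1.0 / next_even
            next_even += 2
        sums.append(total)
    return sums


if __name__ == "__main__":
    for target in (1.5, -1.0, 0.0):
        print(target, rearranged_partial_sums(target, 50000)[-1])
```

Each term of the original series is used at most once and in its original relative order within the positive and negative terms, so the output corresponds to an initial segment of a rearrangement; the printed partial sums should lie close to the chosen targets.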


In our manipulations with series, we generally want to be sure that rearrangements will not 
affect the convergence or the value of the series. That is why the following result is important. 


9.1.5 Rearrangement Theorem Let $\sum x_n$ be an absolutely convergent series in $\mathbb{R}$.
Then any rearrangement $\sum y_k$ of $\sum x_n$ converges to the same value.


Proof. Suppose that $\sum x_n$ converges to $x \in \mathbb{R}$. Thus, if $\varepsilon > 0$, let $N$ be such that if
$n, q > N$ and $s_n := x_1 + \cdots + x_n$, then

\[ |x - s_n| < \varepsilon \quad \text{and} \quad \sum_{k=N+1}^{q} |x_k| < \varepsilon. \]

Let $M \in \mathbb{N}$ be such that all of the terms $x_1, \ldots, x_N$ are contained as summands in
$t_M := y_1 + \cdots + y_M$. It follows that if $m \ge M$, then $t_m - s_n$ is the sum of a finite number
of terms $x_k$ with index $k > N$. Hence, for some $q > N$, we have

\[ |t_m - s_n| \le \sum_{k=N+1}^{q} |x_k| < \varepsilon. \]

Therefore, if $m \ge M$, then we have

\[ |t_m - x| \le |t_m - s_n| + |s_n - x| < \varepsilon + \varepsilon = 2\varepsilon. \]

Since $\varepsilon > 0$ is arbitrary, we conclude that $\sum y_k$ converges to $x$. Q.E.D.


Exercises for Section 9.1 


1. Show that if a convergent series contains only a finite number of negative terms, then it is 
absolutely convergent. 


2. Show that if a series is conditionally convergent, then the series obtained from its positive terms
is divergent, and the series obtained from its negative terms is divergent.

3. If $\sum a_n$ is conditionally convergent, give an argument to show that there exists a rearrangement
whose partial sums diverge to $\infty$.


4. Where is the fact that the series $\sum x_n$ is absolutely convergent used in the proof of 9.1.5?


5. If $\sum a_n$ is absolutely convergent, is it true that every rearrangement of $\sum a_n$ is also absolutely
convergent?


6. Find an explicit expression for the nth partial sum of $\sum_{n=2}^{\infty} \ln(1 - 1/n^2)$ to show that this series
converges to $-\ln 2$. Is this convergence absolute?


7. (a) If $\sum a_n$ is absolutely convergent and $(b_n)$ is a bounded sequence, show that $\sum a_n b_n$ is
absolutely convergent.

(b) Give an example to show that if the convergence of $\sum a_n$ is conditional and $(b_n)$ is a
bounded sequence, then $\sum a_n b_n$ may diverge.


8. Give an example of a convergent series $\sum a_n$ such that $\sum a_n^2$ is not convergent. (Compare this
with Exercise 3.7.11.)


9. If $(a_n)$ is a decreasing sequence of strictly positive numbers and if $\sum a_n$ is convergent, show that
$\lim(n a_n) = 0$.


10. Give an example of a divergent series $\sum a_n$ with $(a_n)$ decreasing and such that $\lim(n a_n) = 0$.

11. If $(a_n)$ is a sequence and if $\lim(n^2 a_n)$ exists in $\mathbb{R}$, show that $\sum a_n$ is absolutely convergent.

12. Let $a > 0$. Show that the series $\sum (1 + a^n)^{-1}$ is divergent if $0 < a \le 1$ and is convergent if $a > 1$.
13. (a) Does the series $\sum_{n=1}^{\infty} \dfrac{\sqrt{n+1} - \sqrt{n}}{\sqrt{n}}$ converge?

(b) Does the series $\sum_{n=1}^{\infty} \dfrac{\sqrt{n+1} - \sqrt{n}}{n}$ converge?
14. If $(a_{n_k})$ is a subsequence of $(a_n)$, then the series $\sum a_{n_k}$ is called a subseries of $\sum a_n$. Show that
$\sum a_n$ is absolutely convergent if and only if every subseries of it is convergent.

15. Let $a : \mathbb{N} \times \mathbb{N} \to \mathbb{R}$ and write $a_{ij} := a(i, j)$. If $A_i := \sum_{j=1}^{\infty} a_{ij}$ for each $i \in \mathbb{N}$ and if
$A := \sum_{i=1}^{\infty} A_i$, we say that $A$ is an iterated sum of the $a_{ij}$ and write
$A = \sum_{i=1}^{\infty} \sum_{j=1}^{\infty} a_{ij}$. We define the other iterated sum, denoted by
$\sum_{j=1}^{\infty} \sum_{i=1}^{\infty} a_{ij}$, in a similar way.
Suppose $a_{ij} \ge 0$ for $i, j \in \mathbb{N}$. If $(c_k)$ is any enumeration of $\{a_{ij} : i, j \in \mathbb{N}\}$, show that the
following statements are equivalent:

(i) The iterated sum $\sum_{i=1}^{\infty} \sum_{j=1}^{\infty} a_{ij}$ converges to $B$.
(ii) The series $\sum_{k=1}^{\infty} c_k$ converges to $C$.

In this case, we have $B = C$.

16. The preceding exercise may fail if the terms are not positive. For example, let $a_{ij} := +1$ if $i - j = 1$,
$a_{ij} := -1$ if $i - j = -1$, and $a_{ij} := 0$ elsewhere. Show that the iterated sums

\[ \sum_{i=1}^{\infty} \sum_{j=1}^{\infty} a_{ij} \quad \text{and} \quad \sum_{j=1}^{\infty} \sum_{i=1}^{\infty} a_{ij} \]

both exist but are not equal.


Section 9.2 Tests for Absolute Convergence 


In Section 3.7 we gave some results concerning the convergence of infinite series; namely,
the nth Term Test, the fact that a series of positive terms is convergent if and only if its
sequence of partial sums is bounded, the Cauchy Criterion, and the Comparison and Limit
Comparison Tests.

We will now give some additional results that may be familiar from calculus. These 
results are particularly useful in establishing absolute convergence. 


9.2.1 Limit Comparison Test, II Suppose that $X := (x_n)$ and $Y := (y_n)$ are nonzero real
sequences and suppose that the following limit exists in $\mathbb{R}$:

\[ r := \lim \left| \frac{x_n}{y_n} \right|. \tag{1} \]

(a) If $r \ne 0$, then $\sum x_n$ is absolutely convergent if and only if $\sum y_n$ is absolutely
convergent.

(b) If $r = 0$ and if $\sum y_n$ is absolutely convergent, then $\sum x_n$ is absolutely convergent.

Proof. This result follows immediately from Theorem 3.7.8. Q.E.D.


The Root and Ratio Tests 


The following test is due to Cauchy. 


9.2.2 Root Test Let $X := (x_n)$ be a sequence in $\mathbb{R}$.

(a) If there exist $r \in \mathbb{R}$ with $r < 1$ and $K \in \mathbb{N}$ such that

\[ |x_n|^{1/n} \le r \quad \text{for } n \ge K, \tag{2} \]

then the series $\sum x_n$ is absolutely convergent.

(b) If there exists $K \in \mathbb{N}$ such that

\[ |x_n|^{1/n} \ge 1 \quad \text{for } n \ge K, \tag{3} \]

then the series $\sum x_n$ is divergent.


Proof. (a) If (2) holds, then we have $|x_n| \le r^n$ for $n \ge K$. Since the geometric series $\sum r^n$
is convergent for $0 \le r < 1$, the Comparison Test 3.7.7 implies that $\sum |x_n|$ is convergent.

(b) If (3) holds, then $|x_n| \ge 1$ for $n \ge K$, so the terms do not approach 0 and the nth
Term Test 3.7.3 applies. Q.E.D.


In calculus courses, one often meets the following version of the Root Test. 


9.2.3 Corollary Let $X := (x_n)$ be a sequence in $\mathbb{R}$ and suppose that the limit

\[ r := \lim |x_n|^{1/n} \tag{4} \]

exists in $\mathbb{R}$. Then $\sum x_n$ is absolutely convergent when $r < 1$ and is divergent when $r > 1$.

Proof. If the limit in (4) exists and $r < 1$, then there exist $r_1$ with $r < r_1 < 1$ and $K \in \mathbb{N}$
such that $|x_n|^{1/n} \le r_1$ for $n \ge K$. In this case we can apply 9.2.2(a).

If $r > 1$, then there exists $K \in \mathbb{N}$ such that $|x_n|^{1/n} > 1$ for $n \ge K$ and the nth Term Test
applies. Q.E.D.


Note No conclusion is possible in Corollary 9.2.3 when $r = 1$, for either convergence or
divergence is possible. See Example 9.2.7(b).


Our next test is due to D'Alembert.


9.2.4 Ratio Test Let $X := (x_n)$ be a sequence of nonzero real numbers.

(a) If there exist $r \in \mathbb{R}$ with $0 < r < 1$ and $K \in \mathbb{N}$ such that

\[ \left| \frac{x_{n+1}}{x_n} \right| \le r \quad \text{for } n \ge K, \tag{5} \]

then the series $\sum x_n$ is absolutely convergent.

(b) If there exists $K \in \mathbb{N}$ such that

\[ \left| \frac{x_{n+1}}{x_n} \right| \ge 1 \quad \text{for } n \ge K, \tag{6} \]

then the series $\sum x_n$ is divergent.


Proof. (a) If (5) holds, an Induction argument shows that $|x_{K+m}| \le |x_K| r^m$ for $m \in \mathbb{N}$.
Thus, for $n \ge K$ the terms in $\sum |x_n|$ are dominated by a fixed multiple of the terms in the
geometric series $\sum r^m$ with $0 < r < 1$. The Comparison Test 3.7.7 then implies that $\sum |x_n|$
is convergent.

(b) If (6) holds, an Induction argument shows that $|x_{K+m}| \ge |x_K|$ for $m \in \mathbb{N}$ and the nth
Term Test applies. Q.E.D.


Once again we have a familiar result from calculus. 


9.2.5 Corollary Let $X := (x_n)$ be a nonzero sequence in $\mathbb{R}$ and suppose that the limit

\[ r := \lim \left| \frac{x_{n+1}}{x_n} \right| \tag{7} \]

exists in $\mathbb{R}$. Then $\sum x_n$ is absolutely convergent when $r < 1$ and is divergent when $r > 1$.


Proof. If $r < 1$ and if $r < r_1 < 1$, then there exists $K \in \mathbb{N}$ such that $|x_{n+1}/x_n| < r_1$ for
$n \ge K$. Thus Theorem 9.2.4(a) applies to give the absolute convergence of $\sum x_n$.

If $r > 1$, then there exists $K \in \mathbb{N}$ such that $|x_{n+1}/x_n| > 1$ for $n \ge K$, whence it follows
that $|x_n|$ does not converge to 0 and the nth Term Test applies. Q.E.D.


Note No conclusion is possible in Corollary 9.2.5 when r = 1, for either convergence or 
divergence is possible. See Example 9.2.7(c). 
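As a numerical illustration of Corollaries 9.2.3 and 9.2.5 and of the two Notes, the following Python sketch evaluates $|x_n|^{1/n}$ and $|x_{n+1}/x_n|$ at a large index. The helper names and the two sample sequences are our own choices: for $x_n = n/2^n$ both quantities are close to $\frac{1}{2} < 1$, while for $x_n = 1/n^2$ both are close to 1, the case in which the corollaries give no information. A value at a single index only suggests the limit; it proves nothing.

```python
def root_quantity(x, n):
    """|x_n|^(1/n), the quantity whose limit appears in Corollary 9.2.3."""
    return abs(x(n)) ** (1.0 / n)


def ratio_quantity(x, n):
    """|x_{n+1} / x_n|, the quantity whose limit appears in Corollary 9.2.5."""
    return abs(x(n + 1) / x(n))


def geometric_like(n):
    """Sample sequence x_n = n / 2^n (both tests give r = 1/2)."""
    return n / 2.0 ** n


def p_series_2(n):
    """Sample sequence x_n = 1 / n^2 (both tests give r = 1)."""
    return 1.0 / n ** 2


if __name__ == "__main__":
    for name, seq in (("n/2^n", geometric_like), ("1/n^2", p_series_2)):
        print(name, root_quantity(seq, 1000), ratio_quantity(seq, 1000))
```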


The Integral Test 


The next test, a very powerful one, uses the notion of the improper integral, which is
defined as follows: If $f$ is in $\mathcal{R}[a, b]$ for every $b > a$ and if the limit $\lim_{b \to \infty} \int_a^b f(t)\,dt$ exists in
$\mathbb{R}$, then the improper integral $\int_a^{\infty} f(t)\,dt$ is defined to be this limit.


9.2.6 Integral Test Let $f$ be a positive, decreasing function on $\{t : t \ge 1\}$. Then the series
$\sum_{k=1}^{\infty} f(k)$ converges if and only if the improper integral

\[ \int_1^{\infty} f(t)\,dt = \lim_{b \to \infty} \int_1^b f(t)\,dt \]

exists. In the case of convergence, the partial sum $s_n = \sum_{k=1}^{n} f(k)$ and the sum $s = \sum_{k=1}^{\infty} f(k)$
satisfy the estimate

\[ \int_{n+1}^{\infty} f(t)\,dt \le s - s_n \le \int_n^{\infty} f(t)\,dt. \tag{8} \]

Proof. Since $f$ is positive and decreasing on the interval $[k - 1, k]$, we have

\[ f(k) \le \int_{k-1}^{k} f(t)\,dt \le f(k - 1). \tag{9} \]

By adding this inequality for $k = 2, 3, \ldots, n$, we obtain

\[ s_n - f(1) \le \int_1^n f(t)\,dt \le s_{n-1}, \]

which shows that either both or neither of the limits

\[ \lim_{n \to \infty} s_n \quad \text{and} \quad \lim_{n \to \infty} \int_1^n f(t)\,dt \]

exist. If they exist, then on adding (9) for $k = n + 1, \ldots, m$, we obtain

\[ s_m - s_n \le \int_n^m f(t)\,dt \le s_{m-1} - s_{n-1}, \]

from which it follows that

\[ \int_{n+1}^{m+1} f(t)\,dt \le s_m - s_n \le \int_n^m f(t)\,dt. \]

If we take the limit in this last inequality as $m \to \infty$, we obtain (8). Q.E.D.
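The estimate (8) is easy to check numerically in a case where everything is known in closed form. The sketch below is an illustration under our own assumptions: it takes $f(t) = 1/t^2$, for which $\int_n^{\infty} f(t)\,dt = 1/n$, and uses Euler's value $\pi^2/6$ for the sum of the series; the function names are hypothetical.

```python
import math


def partial_sum(n):
    """s_n = 1/1^2 + 1/2^2 + ... + 1/n^2."""
    return math.fsum(1.0 / k ** 2 for k in range(1, n + 1))


def remainder_bounds(n):
    """Lower and upper bounds for s - s_n from (8) when f(t) = 1/t^2.

    The improper integrals from n+1 and from n to infinity equal 1/(n+1) and 1/n.
    """
    return 1.0 / (n + 1), 1.0 / n


if __name__ == "__main__":
    s = math.pi ** 2 / 6  # value of the full sum (Euler)
    for n in (1, 10, 100):
        low, high = remainder_bounds(n)
        print(n, low, s - partial_sum(n), high)
```

In particular, (8) tells us in advance how many terms are needed for a given accuracy: here $s - s_n \le 1/n$, so $n = 1000$ terms guarantee an error of at most $1/1000$.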


We will now show how the results in Theorems 9.2.1-9.2.6 can be applied to the
p-series, which were introduced in Example 3.7.6(d, e).


9.2.7 Examples (a) Consider the case $p = 2$; that is, the series $\sum 1/n^2$. We compare it
with the convergent series $\sum 1/(n(n + 1))$ of Example 3.7.2(c). Since

\[ \left| \frac{1}{n^2} \div \frac{1}{n(n+1)} \right| = \frac{n+1}{n} = 1 + \frac{1}{n} \to 1, \]

the Limit Comparison Test 9.2.1 implies that $\sum 1/n^2$ is convergent.

(b) We demonstrate the failure of the Root Test for the p-series. Note that

\[ \left| \frac{1}{n^p} \right|^{1/n} = \frac{1}{(n^p)^{1/n}} = \frac{1}{(n^{1/n})^p}. \]

Since (see Example 3.1.11(d)) we know that $n^{1/n} \to 1$, we have $r = 1$ in Corollary 9.2.3, and
the theorem does not give any information.

(c) We apply the Ratio Test to the p-series. Since

\[ \left| \frac{1}{(n+1)^p} \div \frac{1}{n^p} \right| = \frac{n^p}{(n+1)^p} = \frac{1}{(1 + 1/n)^p} \to 1, \]

the Ratio Test, in the form of Corollary 9.2.5, does not give any information.

(d) Finally, we apply the Integral Test to the p-series. Let $f(t) := 1/t^p$ for $t \ge 1$ and recall
that

\[ \int_1^n \frac{1}{t}\,dt = \ln n - \ln 1, \]
\[ \int_1^n \frac{1}{t^p}\,dt = \frac{1}{1-p} \left( \frac{1}{n^{p-1}} - 1 \right) \quad \text{for } p \ne 1. \]

From these relations we see that the p-series converges if $p > 1$ and diverges if $p \le 1$, as we
have seen before in 3.7.6(d, e).


Raabe’s Test 


If the limits $\lim |x_n|^{1/n}$ and $\lim(|x_{n+1}/x_n|)$ that are used in Corollaries 9.2.3 and 9.2.5 equal
1, we have seen that these tests do not give any information about the convergence or
divergence of the series. In this case it is often useful to employ a more delicate test. Here is
one that is frequently useful.


9.2.8 Raabe's Test Let $X := (x_n)$ be a sequence of nonzero real numbers.

(a) If there exist numbers $a > 1$ and $K \in \mathbb{N}$ such that

\[ \left| \frac{x_{n+1}}{x_n} \right| \le 1 - \frac{a}{n} \quad \text{for } n \ge K, \tag{10} \]

then $\sum x_n$ is absolutely convergent.

(b) If there exist real numbers $a \le 1$ and $K \in \mathbb{N}$ such that

\[ \left| \frac{x_{n+1}}{x_n} \right| \ge 1 - \frac{a}{n} \quad \text{for } n \ge K, \tag{11} \]

then $\sum x_n$ is not absolutely convergent.


Proof. (a) If the inequality (10) holds, then we have (after replacing $n$ by $k$ and
multiplying)

\[ k|x_{k+1}| \le (k - 1)|x_k| - (a - 1)|x_k| \quad \text{for } k \ge K. \]

On reorganizing the inequality, we have

\[ (k - 1)|x_k| - k|x_{k+1}| \ge (a - 1)|x_k| > 0 \quad \text{for } k \ge K, \tag{12} \]

from which we deduce that the sequence $(k|x_{k+1}|)$ is decreasing for $k \ge K$. If we add (12)
for $k = K, \ldots, n$ and note that the left side telescopes, we get

\[ (K - 1)|x_K| - n|x_{n+1}| \ge (a - 1)(|x_K| + \cdots + |x_n|). \]

This shows (why?) that the partial sums of $\sum |x_n|$ are bounded and establishes the absolute
convergence of the series.

(b) If the relation (11) holds for $n \ge K$, then since $a \le 1$, we have

\[ n|x_{n+1}| \ge (n - a)|x_n| \ge (n - 1)|x_n| \quad \text{for } n \ge K. \]

Therefore the sequence $(n|x_{n+1}|)$ is increasing for $n \ge K$ and there exists a number $c > 0$
such that $|x_{n+1}| > c/n$ for $n \ge K$. But since the harmonic series $\sum 1/n$ diverges, the series
$\sum |x_n|$ also diverges. Q.E.D.


In the application of Raabe’s Test, it is often convenient to use the following limiting 
form. 


9.2.9 Corollary Let $X := (x_n)$ be a nonzero sequence in $\mathbb{R}$ and let

\[ a := \lim \left( n \left( 1 - \left| \frac{x_{n+1}}{x_n} \right| \right) \right), \tag{13} \]

whenever this limit exists. Then $\sum x_n$ is absolutely convergent when $a > 1$ and is not
absolutely convergent when $a < 1$.


Proof. Suppose the limit in (13) exists and that $a > 1$. If $a_1$ is any number with $a > a_1 > 1$,
then there exists $K \in \mathbb{N}$ such that $a_1 < n(1 - |x_{n+1}/x_n|)$ for $n \ge K$. Therefore
$|x_{n+1}/x_n| < 1 - a_1/n$ for $n \ge K$ and Raabe's Test 9.2.8(a) applies.

The case where $a < 1$ is similar and is left to the reader. Q.E.D.


Note There is no conclusion when a = 1; either convergence or divergence is possible, as 
the reader can show. 


9.2.10 Examples (a) We reconsider the p-series in the light of Raabe's Test. Applying
L'Hospital's Rule when $p \ge 1$, we obtain (why?)

\[ a = \lim \left( n \left[ 1 - \frac{n^p}{(n+1)^p} \right] \right) = \lim \left( n \left[ \frac{(n+1)^p - n^p}{(n+1)^p} \right] \right) \]
\[ = \lim \left( \frac{(1 + 1/n)^p - 1}{1/n} \right) \cdot \lim \left( \frac{1}{(1 + 1/n)^p} \right) = p \cdot 1 = p. \]

We conclude that if $p > 1$ then the p-series is convergent, and if $0 < p < 1$ then the series is
divergent (since the terms are positive). However, if $p = 1$ (the harmonic series!), Corollary
9.2.9 yields no information.

(b) We now consider $\sum_{n=1}^{\infty} \dfrac{n}{n^2 + 1}$.

An easy calculation shows that $\lim(x_{n+1}/x_n) = 1$, so that Corollary 9.2.5 does not
apply. Also, we have $\lim(n(1 - x_{n+1}/x_n)) = 1$, so that Corollary 9.2.9 does not apply either.
However, it is an exercise to establish the inequality $x_{n+1}/x_n \ge (n - 1)/n$, from which it
follows from Raabe's Test 9.2.8(b) that the series is divergent. (Of course, the Integral Test,
or the Limit Comparison Test with $(y_n) = (1/n)$, can be applied here.)
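The claims made about the series with terms $x_n = n/(n^2 + 1)$ can be checked with exact rational arithmetic. The Python sketch below is an illustration with hypothetical function names; it evaluates the Raabe quantity $n(1 - x_{n+1}/x_n)$ and tests the inequality $x_{n+1}/x_n \ge (n - 1)/n$ for an initial range of indices. Such a finite check supports, but does not replace, the algebraic verification left as an exercise.

```python
from fractions import Fraction


def x(n):
    """The term x_n = n / (n^2 + 1) as an exact rational number."""
    return Fraction(n, n * n + 1)


def raabe_quantity(n):
    """n * (1 - x_{n+1} / x_n), whose limit is the number a of Corollary 9.2.9."""
    return n * (1 - x(n + 1) / x(n))


def ratio_dominates(n):
    """True when x_{n+1} / x_n >= (n - 1) / n, i.e. (11) holds with a = 1."""
    return x(n + 1) / x(n) >= Fraction(n - 1, n)


if __name__ == "__main__":
    for n in (1, 10, 100, 10000):
        print(n, float(raabe_quantity(n)), ratio_dominates(n))
```

A short calculation gives $n(1 - x_{n+1}/x_n) = (n^2 + n - 1)/(n^2 + 2n + 2)$, which is less than 1 for every $n$ and tends to 1; this is exactly why the limiting form is silent while the inequality form 9.2.8(b) with $a = 1$ still applies.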


Although the limiting form 9.2.9 of Raabe’s Test is much easier to apply, Example 
9.2.10(b) shows that the form 9.2.8 is stronger than 9.2.9. 


Remark Leonhard Euler's calculational prowess was remarkable, including his work in
the area of infinite series. His methods are not covered in this book, but we will state one of
his famous results. We know the series $\frac{1}{1^2} + \frac{1}{2^2} + \frac{1}{3^2} + \cdots + \frac{1}{n^2} + \cdots$ converges
(Example 9.2.7(a)), but the problem of determining its exact value is quite difficult. The
problem was known as the Basel problem, named after Basel University in Switzerland,
and Euler solved it in 1735 when he obtained the surprising result that $\sum_{n=1}^{\infty} \frac{1}{n^2} = \frac{\pi^2}{6}$. He also
described a process to derive the values of series with other even powers of $1/n$. For
example, he showed that $\sum_{n=1}^{\infty} \frac{1}{n^4} = \frac{\pi^4}{90}$ and $\sum_{n=1}^{\infty} \frac{1}{n^6} = \frac{\pi^6}{945}$. However, the values of the
corresponding series of odd powers of $1/n$ were not discovered and stay elusive to the
present day.

Another famous problem involves a number that is defined as follows. The harmonic
series is known to diverge, but Euler observed that the sequence

\[ c_n := \frac{1}{1} + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n} - \ln n \]

is convergent. (See Exercise 15.) The limit of the sequence, $\gamma := \lim(c_n)$, is called Euler's
constant and is approximately equal to 0.5772156649. . . . The question of whether $\gamma$ is
a rational or irrational number is a famous unsolved problem. Computers have calculated
over two trillion decimal places, and though the prevailing opinion is that $\gamma$ is irrational, no
proof has yet been found.
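The slow but steady convergence of $(c_n)$ can be observed directly. The following Python sketch is illustrative only (the function name `c` mirrors the notation of the text, and floating-point arithmetic limits the attainable accuracy); it computes $c_n$ for a few values of $n$ and compares them with the approximate value of Euler's constant quoted above.

```python
import math

EULER_CONSTANT_APPROX = 0.5772156649  # approximate value quoted in the text


def c(n):
    """c_n = 1/1 + 1/2 + ... + 1/n - ln n."""
    return math.fsum(1.0 / k for k in range(1, n + 1)) - math.log(n)


if __name__ == "__main__":
    for n in (10, 100, 1000, 100000):
        print(n, c(n), c(n) - EULER_CONSTANT_APPROX)
```

The printed differences are positive and shrink roughly like $1/(2n)$, in agreement with the fact (Exercise 15) that $(c_n)$ is a decreasing sequence of positive numbers.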


Exercises for Section 9.2 


1. Establish the convergence or the divergence of the series whose nth term is:
(a) $\dfrac{1}{(n+1)(n+2)}$, (b) $\dfrac{n}{(n+1)(n+2)}$,
(c) $2^{-1/n}$, (d) $n/2^n$.
2. Establish the convergence or divergence of the series whose nth term is:
(a) $(n(n+1))^{-1/2}$, (b) $(n^2(n+1))^{-1/2}$,
(c) $n!/n^n$, (d) $(-1)^n n/(n+1)$.
3. Discuss the convergence or the divergence of the series with nth term (for sufficiently large n)
given by
(a) $(\ln n)^{-p}$, (b) $(\ln n)^{-n}$,
(c) $(\ln n)^{-\ln n}$, (d) $(\ln n)^{-\ln \ln n}$,
(e) $(n \ln n)^{-1}$, (f) $(n (\ln n)(\ln \ln n)^2)^{-1}$.
4. Discuss the convergence or the divergence of the series with nth term:
(a) $2^n e^{-n}$, (b) $n^n e^{-n}$,
(c) $e^{-\ln n}$, (d) $(\ln n) e^{-\sqrt{n}}$,
(e) $n!\, e^{-n}$, (f) $n!\, e^{-n^2}$.
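5. Show that the series $1/1^2 + 1/2^3 + 1/3^2 + 1/4^3 + \cdots$ is convergent, but that both the Ratio
and the Root Tests fail to apply.

6. If $a$ and $b$ are positive numbers, then $\sum (an + b)^{-p}$ converges if $p > 1$ and diverges if $p \le 1$.

7. Discuss the series whose nth term is
(a) $\dfrac{n!}{3 \cdot 5 \cdot 7 \cdots (2n+1)}$, (b) $\dfrac{(n!)^2}{(2n)!}$,
(c) $\dfrac{2 \cdot 4 \cdots (2n)}{3 \cdot 5 \cdots (2n+1)}$, (d) $\dfrac{2 \cdot 4 \cdots (2n)}{5 \cdot 7 \cdots (2n+3)}$.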

8. Let $0 < a < 1$ and consider the series

\[ a^2 + a + a^4 + a^3 + \cdots + a^{2n} + a^{2n-1} + \cdots. \]

Show that the Root Test applies, but that the Ratio Test does not apply.


9. If $r \in (0, 1)$ satisfies (2) in the Root Test 9.2.2, show that the partial sums $s_n$ of $\sum x_n$
approximate its limit $s$ according to the estimate $|s - s_n| \le r^{n+1}/(1 - r)$ for $n \ge K$.


10. If $r \in (0, 1)$ satisfies (5) in the Ratio Test 9.2.4, show that $|s - s_n| \le r|x_n|/(1 - r)$ for $n \ge K$.

11. If $a > 1$ satisfies (10) in Raabe's Test 9.2.8, show that $|s - s_n| \le n|x_n|/(a - 1)$ for $n \ge K$.


12. For each of the series in Exercise 1 that converge, estimate the remainder if only four terms are
taken; if only ten terms are taken. If we wish to determine the sum of the series within 1/1000,
how many terms should be taken?


13. Answer the questions posed in Exercise 12 for the series given in Exercise 2.


14. Show that the series $1 + \frac{1}{2} - \frac{1}{3} + \frac{1}{4} + \frac{1}{5} - \frac{1}{6} + + - \cdots$ is divergent.
15. For $n \in \mathbb{N}$, let $c_n$ be defined by $c_n := \frac{1}{1} + \frac{1}{2} + \cdots + \frac{1}{n} - \ln n$. Show that $(c_n)$ is a decreasing
sequence of positive numbers. Show that if we put

\[ b_n := \frac{1}{1} - \frac{1}{2} + \frac{1}{3} - \cdots - \frac{1}{2n}, \]

then the sequence $(b_n)$ converges to $\ln 2$. [Hint: $b_n = c_{2n} - c_n + \ln 2$.]

16. Let $\{n_1, n_2, \ldots\}$ denote the collection of natural numbers that do not use the digit 6 in their
decimal expansion. Show that $\sum 1/n_k$ converges to a number less than 80. If $\{m_1, m_2, \ldots\}$ is
the collection of numbers that end in 6, then $\sum 1/m_k$ diverges. If $\{p_1, p_2, \ldots\}$ is the collection
of numbers that do not end in 6, then $\sum 1/p_k$ diverges.

17. If $p > 0$, $q > 0$, show that the series

\[ \sum \frac{(p+1)(p+2) \cdots (p+n)}{(q+1)(q+2) \cdots (q+n)} \]

converges for $q > p + 1$ and diverges for $q \le p + 1$.


18. Suppose that none of the numbers $a$, $b$, $c$ is a negative integer or zero. Prove that the
hypergeometric series

\[ \frac{ab}{1!\,c} + \frac{a(a+1)b(b+1)}{2!\,c(c+1)} + \frac{a(a+1)(a+2)b(b+1)(b+2)}{3!\,c(c+1)(c+2)} + \cdots \]

is absolutely convergent for $c > a + b$ and divergent for $c < a + b$.


19. Let $a_n > 0$ and suppose that $\sum a_n$ converges. Construct a convergent series $\sum b_n$ with $b_n > 0$ such that
$\lim(a_n/b_n) = 0$; hence $\sum b_n$ converges less rapidly than $\sum a_n$. [Hint: Let $(A_n)$ be the partial sums of
$\sum a_n$ and $A$ its limit. Define $b_1 := \sqrt{A} - \sqrt{A - A_1}$ and $b_n := \sqrt{A - A_{n-1}} - \sqrt{A - A_n}$ for $n > 1$.]

20. Let $(a_n)$ be a decreasing sequence of real numbers converging to 0 and suppose that $\sum a_n$ diverges.
Construct a divergent series $\sum b_n$ with $b_n > 0$ such that $\lim(b_n/a_n) = 0$; hence $\sum b_n$ diverges less
rapidly than $\sum a_n$. [Hint: Let $b_n := a_n/\sqrt{A_n}$ where $A_n$ is the nth partial sum of $\sum a_n$.]


Section 9.3 Tests for Nonabsolute Convergence 


The convergence tests that were discussed in the preceding section were primarily directed
to establishing the absolute convergence of a series. Since there are many series, such as

\[ \sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n}, \qquad \sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{\sqrt{n}}, \tag{1} \]

that are convergent but not absolutely convergent, it is desirable to have some tests for this
phenomenon. In this short section we shall present first the test for alternating series and
then tests for more general series due to Dirichlet and Abel.


Alternating Series 


The most familiar test for nonabsolutely convergent series is the one due to Leibniz that is
applicable to series that are "alternating" in the following sense.


9.3.1 Definition A sequence $X := (x_n)$ of nonzero real numbers is said to be alternating
if the terms $(-1)^{n+1} x_n$, $n \in \mathbb{N}$, are all positive (or all negative) real numbers. If the sequence
$X = (x_n)$ is alternating, we say that the series $\sum x_n$ it generates is an alternating series.


In the case of an alternating series, it is useful to set $x_n = (-1)^{n+1} z_n$ [or $x_n = (-1)^n z_n$],
where $z_n > 0$ for all $n \in \mathbb{N}$.


9.3.2 Alternating Series Test Let $Z := (z_n)$ be a decreasing sequence of strictly positive
numbers with $\lim(z_n) = 0$. Then the alternating series $\sum (-1)^{n+1} z_n$ is convergent.

Proof. Since we have

\[ s_{2n} = (z_1 - z_2) + (z_3 - z_4) + \cdots + (z_{2n-1} - z_{2n}), \]

and since $z_k - z_{k+1} \ge 0$, it follows that the subsequence $(s_{2n})$ of partial sums is increasing.
Since

\[ s_{2n} = z_1 - (z_2 - z_3) - \cdots - (z_{2n-2} - z_{2n-1}) - z_{2n}, \]

it also follows that $s_{2n} \le z_1$ for all $n \in \mathbb{N}$. It follows from the Monotone Convergence
Theorem 3.3.2 that the subsequence $(s_{2n})$ converges to some number $s \in \mathbb{R}$.

We now show that the entire sequence $(s_n)$ converges to $s$. Indeed, if $\varepsilon > 0$, let $K$ be
such that if $n \ge K$ then $|s_{2n} - s| \le \frac{1}{2}\varepsilon$ and $|z_{2n+1}| \le \frac{1}{2}\varepsilon$. It follows that if $n \ge K$ then

\[ |s_{2n+1} - s| = |s_{2n} + z_{2n+1} - s| \le |s_{2n} - s| + |z_{2n+1}| \le \tfrac{1}{2}\varepsilon + \tfrac{1}{2}\varepsilon = \varepsilon. \]

Therefore every partial sum of an odd number of terms is also within $\varepsilon$ of $s$ if $n$ is large
enough. Since $\varepsilon > 0$ is arbitrary, the convergence of $(s_n)$ and hence of $\sum (-1)^{n+1} z_n$ is
established. Q.E.D.


Note It is an exercise to show that if $s$ is the sum of the alternating series and if $s_n$ is its nth
partial sum, then

\[ |s - s_n| \le z_{n+1}. \tag{2} \]


It is clear that this Alternating Series Test establishes the convergence of the two series 
already mentioned, in (1). 
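The error estimate (2) can be observed on the first of the series in (1). The sketch below is an illustration under one borrowed fact: the sum of $\sum (-1)^{n+1}/n$ is $\ln 2$, which is the content of Exercise 15 of Section 9.2. The function name is hypothetical.

```python
import math


def alternating_partial_sum(n):
    """s_n = 1 - 1/2 + 1/3 - ... + (-1)^(n+1)/n."""
    return math.fsum((-1) ** (k + 1) / k for k in range(1, n + 1))


def error_and_bound(n):
    """Return (|s - s_n|, z_{n+1}) for z_k = 1/k and s = ln 2."""
    return abs(math.log(2) - alternating_partial_sum(n)), 1.0 / (n + 1)


if __name__ == "__main__":
    for n in (1, 2, 5, 50, 1001):
        print(n, *error_and_bound(n))
```

The bound $z_{n+1}$ is not far from sharp here: the actual error behaves like $1/(2n)$, so the series converges, but too slowly to be of much use for computing $\ln 2$.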


The Dirichlet and Abel Tests 


We will now present two other tests of wide applicability. They are based on the following
lemma, which is sometimes called the partial summation formula, since it corresponds to
the familiar formula for integration by parts.

9.3.3 Abel's Lemma Let $X := (x_n)$ and $Y := (y_n)$ be sequences in $\mathbb{R}$ and let the partial
sums of $\sum y_n$ be denoted by $(s_n)$ with $s_0 := 0$. If $m > n$, then

\[ \sum_{k=n+1}^{m} x_k y_k = (x_m s_m - x_{n+1} s_n) + \sum_{k=n+1}^{m-1} (x_k - x_{k+1}) s_k. \tag{3} \]

Proof. Since $y_k = s_k - s_{k-1}$ for $k = 1, 2, \ldots$, the left side of (3) is seen to be equal to
$\sum_{k=n+1}^{m} x_k (s_k - s_{k-1})$. If we collect the terms multiplying $s_n, s_{n+1}, \ldots, s_m$, we obtain the
right side of (3). Q.E.D.
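Since the identity in Abel's Lemma is purely algebraic, it can be confirmed exactly on sample data. In the Python sketch below, the formula is read as $\sum_{k=n+1}^{m} x_k y_k = (x_m s_m - x_{n+1} s_n) + \sum_{k=n+1}^{m-1} (x_k - x_{k+1}) s_k$; the function name `abel_sides` and the sample sequences are our own choices, and exact rational arithmetic is used so that the two sides can be compared with `==`.

```python
from fractions import Fraction


def abel_sides(x, y, n, m):
    """Return (left, right) sides of Abel's partial summation formula, for m > n >= 0.

    x and y are functions of the index k = 1, 2, ... returning Fractions.
    """
    # s[k] = y(1) + ... + y(k), with s[0] = 0 as in the lemma.
    s = [Fraction(0)]
    for k in range(1, m + 1):
        s.append(s[-1] + y(k))
    left = sum(x(k) * y(k) for k in range(n + 1, m + 1))
    boundary = x(m) * s[m] - x(n + 1) * s[n]
    interior = sum((x(k) - x(k + 1)) * s[k] for k in range(n + 1, m))
    return left, boundary + interior


if __name__ == "__main__":
    x = lambda k: Fraction(1, k)            # a decreasing sequence
    y = lambda k: Fraction((-1) ** k * k)   # arbitrary second sequence
    print(abel_sides(x, y, 3, 12))
```

The analogy with integration by parts is visible in the code: `boundary` plays the role of the boundary term, and the differences $x_k - x_{k+1}$ play the role of the derivative of one factor.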


We now apply Abel's Lemma to obtain tests for convergence of series of the form
$\sum x_n y_n$.


9.3.4 Dirichlet's Test If $X := (x_n)$ is a decreasing sequence with $\lim x_n = 0$, and if the
partial sums $(s_n)$ of $\sum y_n$ are bounded, then the series $\sum x_n y_n$ is convergent.


Proof. Let $|s_n| \le B$ for all $n \in \mathbb{N}$. If $m > n$, it follows from Abel's Lemma 9.3.3 and the
fact that $x_k - x_{k+1} \ge 0$ that

\[ \left| \sum_{k=n+1}^{m} x_k y_k \right| \le (x_m + x_{n+1})B + \sum_{k=n+1}^{m-1} (x_k - x_{k+1})B \]
\[ = [(x_m + x_{n+1}) + (x_{n+1} - x_m)]B = 2x_{n+1}B. \]

Since $\lim(x_k) = 0$, the convergence of $\sum x_k y_k$ follows from the Cauchy Convergence
Criterion 3.7.4. Q.E.D.


9.3.5 Abel's Test If $X := (x_n)$ is a convergent monotone sequence and the series $\sum y_n$
is convergent, then the series $\sum x_n y_n$ is also convergent.


Proof. If $(x_n)$ is decreasing with limit $x$, let $u_n := x_n - x$, $n \in \mathbb{N}$, so that $(u_n)$ decreases to 0.
Then $x_n = x + u_n$, whence $x_n y_n = x y_n + u_n y_n$. It follows from the Dirichlet Test 9.3.4 that
$\sum u_n y_n$ is convergent and, since $\sum x y_n$ converges (because of the assumed convergence of
the series $\sum y_n$), we conclude that $\sum x_n y_n$ is convergent.

If $(x_n)$ is increasing with limit $x$, let $v_n := x - x_n$, $n \in \mathbb{N}$, so that $(v_n)$ decreases to 0. Here
$x_n = x - v_n$, whence $x_n y_n = x y_n - v_n y_n$, and the argument proceeds as before. Q.E.D.


9.3.6 Examples (a) Since we have

\[ 2(\sin \tfrac{1}{2}x)(\cos x + \cdots + \cos nx) = \sin(n + \tfrac{1}{2})x - \sin \tfrac{1}{2}x, \]

it follows that if $x \ne 2k\pi$ ($k \in \mathbb{N}$), then

\[ |\cos x + \cdots + \cos nx| = \frac{|\sin(n + \tfrac{1}{2})x - \sin \tfrac{1}{2}x|}{|2 \sin \tfrac{1}{2}x|} \le \frac{1}{|\sin \tfrac{1}{2}x|}. \]

Hence Dirichlet's Test implies that if $(a_n)$ is decreasing with $\lim(a_n) = 0$, then the series
$\sum_{n=1}^{\infty} a_n \cos nx$ converges provided $x \ne 2k\pi$.

(b) Since we have

\[ 2(\sin \tfrac{1}{2}x)(\sin x + \cdots + \sin nx) = \cos \tfrac{1}{2}x - \cos(n + \tfrac{1}{2})x, \]

it follows that if $x \ne 2k\pi$ ($k \in \mathbb{N}$), then

\[ |\sin x + \cdots + \sin nx| \le \frac{1}{|\sin \tfrac{1}{2}x|}. \]

As before, if $(a_n)$ is decreasing and if $\lim(a_n) = 0$, then the series $\sum_{n=1}^{\infty} a_n \sin nx$ converges for
$x \ne 2k\pi$ (and it also converges for these values).
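The boundedness of the partial sums in Example 9.3.6(b), which is the hypothesis that Dirichlet's Test needs, can be checked numerically. The Python sketch below is illustrative (hypothetical function names, floating-point arithmetic); for a given $x$ that is not a multiple of $2\pi$ it records the largest value of $|\sin x + \cdots + \sin nx|$ over a range of $n$ and compares it with the bound $1/|\sin \frac{1}{2}x|$.

```python
import math


def max_abs_sine_partial_sum(x, n_max):
    """Largest |sin x + sin 2x + ... + sin nx| over n = 1, ..., n_max."""
    total = 0.0
    worst = 0.0
    for k in range(1, n_max + 1):
        total += math.sin(k * x)
        worst = max(worst, abs(total))
    return worst


def dirichlet_bound(x):
    """The bound 1 / |sin(x/2)|, valid when x is not a multiple of 2*pi."""
    return 1.0 / abs(math.sin(x / 2.0))


if __name__ == "__main__":
    for x in (0.1, 1.0, 3.0):
        print(x, max_abs_sine_partial_sum(x, 5000), dirichlet_bound(x))
```

Note how the bound deteriorates as $x$ approaches a multiple of $2\pi$: the partial sums remain bounded for each fixed $x$, but not uniformly in $x$.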


Exercises for Section 9.3 


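1. Test the following series for convergence and for absolute convergence:
(a) $\sum_{n=1}^{\infty} \dfrac{(-1)^{n+1}}{n^2 + 1}$, (b) $\sum_{n=1}^{\infty} \dfrac{(-1)^{n+1}}{n + 1}$,
(c) $\sum_{n=1}^{\infty} \dfrac{(-1)^{n+1} n}{n + 2}$, (d) $\sum_{n=1}^{\infty} (-1)^{n+1} \dfrac{\ln n}{n}$.

2. If $s_n$ is the nth partial sum of the alternating series $\sum_{n=1}^{\infty} (-1)^{n+1} z_n$, and if $s$ denotes the sum of this
series, show that $|s - s_n| \le z_{n+1}$.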


3. Give an example to show that the Alternating Series Test 9.3.2 may fail if (z,,) is not a decreasing 
sequence. 


4. Show that the Alternating Series Test is a consequence of Dirichlet’s Test 9.3.4. 


5. Consider the series 


where the signs come in pairs. Does it converge? 


6. Let $a_n \in \mathbb{R}$ for $n \in \mathbb{N}$ and let $p < q$. If the series $\sum a_n/n^p$ is convergent, show that the series
$\sum a_n/n^q$ is also convergent.


7. If $p$ and $q$ are positive numbers, show that $\sum (-1)^n (\ln n)^p / n^q$ is a convergent series.


8. Discuss the series whose nth term is:
(a) $(-1)^n \dfrac{n^n}{(n+1)^{n+1}}$, (b) $\dfrac{n^n}{(n+1)^{n+1}}$,
(c) $(-1)^n \dfrac{(n+1)^n}{n^n}$, (d) $\dfrac{(n+1)^n}{n^{n+1}}$.


9. If the partial sums of $\sum a_n$ are bounded, show that the series $\sum_{n=1}^{\infty} a_n e^{-nt}$ converges for $t > 0$.


10. If the partial sums $s_n$ of $\sum_{n=1}^{\infty} a_n$ are bounded, show that the series $\sum_{n=1}^{\infty} a_n/n$ converges to
$\sum_{n=1}^{\infty} s_n/(n(n + 1))$.


11. Can Dirichlet's Test be applied to establish the convergence of

\[ 1 - \frac{1}{2} - \frac{1}{3} + \frac{1}{4} + \frac{1}{5} + \frac{1}{6} - \cdots \]

where the number of signs increases by one in each "block"? If not, use another method to
establish the convergence of this series.

12. Show that the hypothesis that the sequence $X := (x_n)$ is decreasing in Dirichlet's Test 9.3.4 can
be replaced by the hypothesis that $\sum_{n=1}^{\infty} |x_n - x_{n+1}|$ is convergent.

13. If $(a_n)$ is a bounded decreasing sequence and $(b_n)$ is a bounded increasing sequence and if $x_n :=
a_n + b_n$ for $n \in \mathbb{N}$, show that $\sum_{n=1}^{\infty} |x_n - x_{n+1}|$ is convergent.

14. Show that if the partial sums $s_n$ of the series $\sum_{k=1}^{\infty} a_k$ satisfy $|s_n| \le M n^r$ for some $r < 1$, then the
series $\sum_{n=1}^{\infty} a_n/n$ converges.

15. Suppose that $\sum a_n$ is a convergent series of real numbers. Either prove that $\sum b_n$ converges or
give a counter-example, when we define $b_n$ by

(a) $a_n/n$, (b) $\sqrt{a_n}/n$ $(a_n \ge 0)$,
(c) $a_n \sin n$, (d) $\sqrt{a_n/n}$ $(a_n \ge 0)$,
(e) $n^{1/n} a_n$, (f) $a_n/(1 + |a_n|)$.


Section 9.4 Series of Functions 


Because of their frequent appearance and importance, we now present a discussion of infinite 
series of functions. Since the convergence of an infinite series is handled by examining the 
sequence of partial sums, questions concerning series of functions are answered by examining 
corresponding questions for sequences of functions. For this reason, a portion of the present 
section is merely a translation of facts already established for sequences of functions into 
series terminology. However, in the second part of the section, where we discuss power series, 
some new features arise because of the special character of the functions involved. 


9.4.1 Definition If $(f_n)$ is a sequence of functions defined on a subset $D$ of $\mathbb{R}$ with values
in $\mathbb{R}$, the sequence of partial sums $(s_n)$ of the infinite series $\sum f_n$ is defined for $x$ in $D$ by

\[ s_n(x) := f_1(x) + f_2(x) + \cdots + f_n(x). \]

In case the sequence $(s_n)$ of functions converges on $D$ to a function $f$, we say that the infinite
series of functions $\sum f_n$ converges to $f$ on $D$. We will often write

\[ \sum f_n \quad \text{or} \quad \sum_{n=1}^{\infty} f_n \]

to denote either the series or the limit function, when it exists.


If the series $\sum |f_n(x)|$ converges for each $x$ in $D$, we say that $\sum f_n$ is absolutely
convergent on $D$. If the sequence $(s_n)$ of partial sums is uniformly convergent on $D$ to $f$, we
say that $\sum f_n$ is uniformly convergent on $D$, or that it converges to $f$ uniformly on $D$.

One of the main reasons for the interest in uniformly convergent series of functions is 
the validity of the following results, which give conditions justifying the change of order of 
the summation and other limiting operations. 


9.4.2 Theorem If $f_n$ is continuous on $D \subseteq \mathbb{R}$ to $\mathbb{R}$ for each $n \in \mathbb{N}$ and if $\sum f_n$ converges
to $f$ uniformly on $D$, then $f$ is continuous on $D$.

This is a direct translation of Theorem 8.2.2 for series. The next result is a translation
of Theorem 8.2.4.


9.4.3 Theorem Suppose that the real-valued functions $f_n$, $n \in \mathbb{N}$, are Riemann integrable
on the interval $J := [a, b]$. If the series $\sum f_n$ converges to $f$ uniformly on $J$, then $f$ is
Riemann integrable and

\[ \int_a^b f = \sum_{n=1}^{\infty} \int_a^b f_n. \tag{1} \]


Next we turn to the corresponding theorem pertaining to differentiation. Here we
assume the uniform convergence of the series obtained after term-by-term differentiation
of the given series. This result is an immediate consequence of Theorem 8.2.3.


9.4.4 Theorem For each $n \in \mathbb{N}$, let $f_n$ be a real-valued function on $J := [a, b]$ that has a
derivative $f_n'$ on $J$. Suppose that the series $\sum f_n$ converges for at least one point of $J$ and
that the series of derivatives $\sum f_n'$ converges uniformly on $J$.

Then there exists a real-valued function $f$ on $J$ such that $\sum f_n$ converges uniformly on
$J$ to $f$. In addition, $f$ has a derivative on $J$ and $f' = \sum f_n'$.


Tests for Uniform Convergence 


Since we have stated some consequences of uniform convergence of series, we shall now 
present a few tests that can be used to establish uniform convergence. 


9.4.5 Cauchy Criterion Let $(f_n)$ be a sequence of functions on $D \subseteq \mathbb{R}$ to $\mathbb{R}$. The series
$\sum f_n$ is uniformly convergent on $D$ if and only if for every $\varepsilon > 0$ there exists an $M(\varepsilon)$ such
that if $m > n \ge M(\varepsilon)$, then

\[ |f_{n+1}(x) + \cdots + f_m(x)| < \varepsilon \quad \text{for all } x \in D. \]

9.4.6 Weierstrass M-Test Let $(M_n)$ be a sequence of positive real numbers such that
$|f_n(x)| \le M_n$ for $x \in D$, $n \in \mathbb{N}$. If the series $\sum M_n$ is convergent, then $\sum f_n$ is uniformly
convergent on $D$.


Proof. If $m > n$, we have the relation

\[ |f_{n+1}(x) + \cdots + f_m(x)| \le M_{n+1} + \cdots + M_m \quad \text{for } x \in D. \]

Now apply 3.7.4, 9.4.5, and the convergence of $\sum M_n$. Q.E.D.
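The inequality used in the proof of the M-Test can be watched at work on a concrete series. The Python sketch below is an illustration under our own choices: $f_k(x) := \sin(kx)/k^2$ with $M_k := 1/k^2$, a finite grid of sample points standing in for the set $D$, and hypothetical function names. It compares the largest observed value of $|f_{n+1}(x) + \cdots + f_m(x)|$ with $M_{n+1} + \cdots + M_m$.

```python
import math


def tail_sup_on_grid(n, m, grid):
    """max over the grid of |f_{n+1}(x) + ... + f_m(x)| for f_k(x) = sin(kx)/k^2."""
    return max(
        abs(math.fsum(math.sin(k * x) / k ** 2 for k in range(n + 1, m + 1)))
        for x in grid
    )


def tail_bound(n, m):
    """M_{n+1} + ... + M_m for M_k = 1/k^2; independent of x."""
    return math.fsum(1.0 / k ** 2 for k in range(n + 1, m + 1))


if __name__ == "__main__":
    grid = [0.01 * i for i in range(-700, 701)]  # sample points in [-7, 7]
    for n, m in ((10, 60), (100, 150)):
        print(n, m, tail_sup_on_grid(n, m, grid), tail_bound(n, m))
```

The essential point is that `tail_bound` does not depend on $x$: once $n$ is large enough to make it small, the Cauchy Criterion 9.4.5 is satisfied for all $x$ at once. A grid, of course, samples only finitely many points of $D$.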


In Appendix E we will use the Weierstrass M-Test to construct two interesting examples. 


Power Series 


We shall now turn to a discussion of power series. This is an important class of series of 
functions and enjoys properties that are not valid for general series of functions. 


9.4.7 Definition A series of real functions $\sum f_n$ is said to be a power series around
$x = c$ if the function $f_n$ has the form

\[ f_n(x) = a_n (x - c)^n, \]

where $a_n$ and $c$ belong to $\mathbb{R}$ and where $n = 0, 1, 2, \ldots$.

For the sake of simplicity of our notation, we shall treat only the case where $c = 0$. This
is no loss of generality, however, since the translation $x' = x - c$ reduces a power series
around $c$ to a power series around 0. Thus, whenever we refer to a power series, we shall
mean a series of the form

\[ \sum_{n=0}^{\infty} a_n x^n = a_0 + a_1 x + \cdots + a_n x^n + \cdots. \tag{2} \]

Even though the functions appearing in (2) are defined over all of $\mathbb{R}$, it is not to be
expected that the series (2) will converge for all $x$ in $\mathbb{R}$. For example, by using the Ratio
Test 9.2.4, we can show that the series

\[ \sum_{n=0}^{\infty} n!\,x^n, \qquad \sum_{n=0}^{\infty} x^n, \qquad \sum_{n=0}^{\infty} x^n/n!, \]

converge for $x$ in the sets

\[ \{0\}, \qquad \{x \in \mathbb{R} : |x| < 1\}, \qquad \mathbb{R}, \]

respectively. Thus, the set on which a power series converges may be small, medium, or
large. However, an arbitrary subset of $\mathbb{R}$ cannot be the precise set on which a power series
converges, as we shall show.

If $(b_n)$ is a bounded sequence of nonnegative real numbers, then we define the limit
superior of $(b_n)$ to be the infimum of those numbers $v$ such that $b_n \le v$ for all sufficiently
large $n \in \mathbb{N}$. This infimum is uniquely determined and is denoted by $\limsup(b_n)$. The only
facts we need to know are (i) that if $v > \limsup(b_n)$, then $b_n \le v$ for all sufficiently large
$n \in \mathbb{N}$, and (ii) that if $w < \limsup(b_n)$, then $w \le b_n$ for infinitely many $n \in \mathbb{N}$. (See 3.4.10
and 3.4.11.)


9.4.8 Definition Let X` a,x" be a power series. If the sequence (\ay|'/ ") is bounded, we 
set p := lim sup(|an|1/ "); if this sequence is not bounded we set p = +00. We define the 
radius of convergence of X` a,x” to be given by 


0 if p=-+0, 
R:=¢1/p if 0<p< +o, 
+œ if p=0. 


The interval of convergence is the open interval (—R, R). 
We shall now justify the term “radius of convergence.” 


9.4.9 Cauchy-Hadamard Theorem If $R$ is the radius of convergence of the power
series $\sum a_n x^n$, then the series is absolutely convergent if $|x| < R$ and is divergent if
$|x| > R$.


Proof. We shall treat only the case where $0 < R < +\infty$, leaving the cases $R = 0$ and
$R = +\infty$ as exercises. If $0 < |x| < R$, then there exists a positive number $c < 1$ such that
$|x| < cR$. Therefore $\rho < c/|x|$ and so it follows that if $n$ is sufficiently large, then
$|a_n|^{1/n} \le c/|x|$. This is equivalent to the statement that

$$(3) \qquad |a_n x^n| \le c^n$$

for all sufficiently large $n$. Since $c < 1$, the absolute convergence of $\sum a_n x^n$ follows from
the Comparison Test 3.7.7.

If $|x| > R = 1/\rho$, then there are infinitely many $n \in \mathbb{N}$ for which $|a_n|^{1/n} > 1/|x|$.
Therefore, $|a_n x^n| > 1$ for infinitely many $n$, so that the sequence $(a_n x^n)$ does not converge
to zero. Q.E.D.
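
As a numerical illustration of Definition 9.4.8 (a sketch only, not part of the argument above), the following Python snippet evaluates $|a_n|^{1/n}$ at one large index for the coefficient sequences $a_n = n!$, $a_n = 1$, and $a_n = 1/n!$. A single term can only suggest the value of the limit superior; the helper name `root_term` and the choice n = 1000 are assumptions made for this illustration. Logarithms are used so that $n!$ does not overflow.

```python
import math

def root_term(log_abs_a, n):
    """Return |a_n|**(1/n), given a function that computes log|a_n|."""
    return math.exp(log_abs_a(n) / n)

n = 1000
# log|a_n| for a_n = n!, a_n = 1, and a_n = 1/n!  (lgamma(k + 1) = log(k!))
examples = {
    "n!":   lambda k: math.lgamma(k + 1),
    "1":    lambda k: 0.0,
    "1/n!": lambda k: -math.lgamma(k + 1),
}
roots = {name: root_term(f, n) for name, f in examples.items()}
for name, r in roots.items():
    print(f"a_n = {name:5s} |a_n|^(1/n) at n = {n}: {r:.4g}")
```

The large, constant, and small values printed are consistent with $\rho = +\infty$, $\rho = 1$, and $\rho = 0$, that is, with $R = 0$, $R = 1$, and $R = +\infty$, respectively.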


Remark It will be noted that the Cauchy-Hadamard Theorem makes no statement as to
whether the power series converges when $|x| = R$. Indeed, anything can happen, as the
examples

$$\sum x^n, \qquad \sum \frac{1}{n} x^n, \qquad \sum \frac{1}{n^2} x^n,$$

show. Since $\lim(n^{1/n}) = 1$, each of these power series has radius of convergence equal to 1.
The first power series converges at neither of the points $x = -1$ and $x = +1$; the second
series converges at $x = -1$ but diverges at $x = +1$; and the third power series converges at
both $x = -1$ and $x = +1$. (Find a power series with $R = 1$ that converges at $x = +1$ but
diverges at $x = -1$.)

It is an exercise to show that the radius of convergence of the series $\sum a_n x^n$ is also
given by

$$(4) \qquad \lim \left| \frac{a_n}{a_{n+1}} \right|,$$

provided this limit exists. Frequently, it is more convenient to use (4) than Definition 9.4.8.
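
To see formula (4) at work, here is a small Python sketch that computes the ratio $|a_n/a_{n+1}|$ exactly for a sample sequence. The sequence $a_n = \binom{2n}{n}$ is not taken from the text; it is chosen only for illustration, and the helper name `ratio_estimate` is hypothetical. For this sequence the ratio simplifies to $(n+1)/(2(2n+1))$, which tends to $1/4$.

```python
from fractions import Fraction
from math import comb

def ratio_estimate(a, n):
    """Return |a(n) / a(n + 1)| as an exact fraction."""
    return abs(Fraction(a(n), a(n + 1)))

def central(n):
    """Illustrative coefficients a_n = (2n)! / (n!)^2."""
    return comb(2 * n, n)

estimates = [float(ratio_estimate(central, n)) for n in (10, 100, 1000)]
print(estimates)  # the values decrease toward 1/4
```

The computed ratios suggest that this series has radius of convergence $1/4$, in agreement with the limit computed by hand.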

The argument used in the proof of the Cauchy-Hadamard Theorem yields the uniform 
convergence of the power series on any fixed closed and bounded interval in the interval of 
convergence (—R, R). 


9.4.10 Theorem Let $R$ be the radius of convergence of $\sum a_n x^n$ and let $K$ be a closed and
bounded interval contained in the interval of convergence $(-R, R)$. Then the power series
converges uniformly on $K$.


Proof. The hypothesis on $K \subseteq (-R, R)$ implies that there exists a positive constant $c < 1$
such that $|x| < cR$ for all $x \in K$. (Why?) By the argument in 9.4.9, we infer that for
sufficiently large $n$, the estimate (3) holds for all $x \in K$. Since $c < 1$, the uniform
convergence of $\sum a_n x^n$ on $K$ is a direct consequence of the Weierstrass M-test with
$M_n := c^n$. Q.E.D.


9.4.11 Theorem The limit of a power series is continuous on the interval of convergence.
A power series can be integrated term-by-term over any closed and bounded
interval contained in the interval of convergence.


Proof. If $|x_0| < R$, then the preceding result asserts that $\sum a_n x^n$ converges uniformly on
any closed and bounded neighborhood of $x_0$ contained in $(-R, R)$. The continuity at $x_0$ then
follows from Theorem 9.4.2, and the term-by-term integration is justified by Theorem
9.4.3. Q.E.D.


We now show that a power series can be differentiated term-by-term. Unlike the 
situation for general series, we do not need to assume that the differentiated series is 
uniformly convergent. Hence this result is stronger than Theorem 9.4.4. 

9.4.12 Differentiation Theorem A power series can be differentiated term-by-term
within the interval of convergence. In fact, if

$$f(x) = \sum_{n=0}^{\infty} a_n x^n, \quad \text{then} \quad f'(x) = \sum_{n=1}^{\infty} n a_n x^{n-1} \quad \text{for } |x| < R.$$

Both series have the same radius of convergence.


Proof. Since $\lim(n^{1/n}) = 1$, the sequence $(|na_n|^{1/n})$ is bounded if and only if the sequence
$(|a_n|^{1/n})$ is bounded. Moreover, it is easily seen that

$$\limsup\left(|na_n|^{1/n}\right) = \limsup\left(|a_n|^{1/n}\right).$$

Therefore, the radius of convergence of the two series is the same, so the formally
differentiated series is uniformly convergent on each closed and bounded interval contained
in the interval of convergence. We can then apply Theorem 9.4.4 to conclude that the
formally differentiated series converges to the derivative of the given series. Q.E.D.
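
The Differentiation Theorem can be checked numerically on a familiar case. The sketch below (the geometric series and the evaluation point x = 0.5 are chosen here for illustration and are not part of the proof) compares partial sums of $\sum x^n$ and of the formally differentiated series $\sum n x^{n-1}$ with the closed forms $1/(1-x)$ and $1/(1-x)^2$.

```python
def partial_sum(coeff, x, terms):
    """Sum coeff(n) * x**n for n = 0, ..., terms - 1."""
    return sum(coeff(n) * x**n for n in range(terms))

x, terms = 0.5, 200
f_val = partial_sum(lambda n: 1.0, x, terms)      # partial sum of sum x^n
# Differentiated series: sum_{n>=1} n x^(n-1) = sum_{m>=0} (m + 1) x^m
df_val = partial_sum(lambda m: m + 1.0, x, terms)

print(f_val, 1 / (1 - x))         # both close to 2
print(df_val, 1 / (1 - x) ** 2)   # both close to 4
```

Both series have radius of convergence 1, so the point x = 0.5 lies inside the common interval of convergence.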


Remark It is to be observed that the theorem makes no assertion about the endpoints of
the interval of convergence. If a series is convergent at an endpoint, then the differentiated
series may or may not be convergent at this point. For example, the series $\sum_{n=1}^{\infty} x^n/n^2$
converges at both endpoints $x = -1$ and $x = +1$. However, the differentiated series given
by $\sum_{n=1}^{\infty} x^{n-1}/n$ converges at $x = -1$ but diverges at $x = +1$.


By repeated application of the preceding result, we conclude that if $k \in \mathbb{N}$ then
$\sum_{n=0}^{\infty} a_n x^n$ can be differentiated term-by-term $k$ times to obtain

$$(5) \qquad \sum_{n=k}^{\infty} \frac{n!}{(n-k)!}\, a_n x^{n-k}.$$

Moreover, this series converges absolutely to $f^{(k)}(x)$ for $|x| < R$ and uniformly over any
closed and bounded interval in the interval of convergence. If we substitute $x = 0$ in (5), we
obtain the important formula

$$f^{(k)}(0) = k!\,a_k.$$


9.4.13 Uniqueness Theorem If $\sum a_n x^n$ and $\sum b_n x^n$ converge on some interval
$(-r, r)$, $r > 0$, to the same function $f$, then

$$a_n = b_n \quad \text{for all } n \in \mathbb{N}.$$


Proof. Our preceding remarks show that $n!\,a_n = f^{(n)}(0) = n!\,b_n$ for all $n \in \mathbb{N}$. Q.E.D.


Taylor Series 


If a function $f$ has derivatives of all orders at a point $c$ in $\mathbb{R}$, then we can calculate the
Taylor coefficients by $a_0 := f(c)$, $a_n := f^{(n)}(c)/n!$ for $n \in \mathbb{N}$ and in this way obtain a power
series with these coefficients. However, it is not necessarily true that the resulting power series
converges to the function $f$ in an interval about $c$. (See Exercise 12 for an example.) The issue of
convergence is resolved by the remainder term $R_n$ in Taylor's Theorem 6.4.1. We will write

$$(6) \qquad f(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(c)}{n!} (x - c)^n$$

for $|x - c| < R$ if and only if the sequence $(R_n(x))$ of remainders converges to 0 for each $x$
in some interval $\{x : |x - c| < R\}$. In this case we say that the power series (6) is the
Taylor expansion of $f$ at $c$. We observe that the Taylor polynomials for $f$ discussed in
Section 6.4 are just the partial sums of the Taylor expansion (6) of $f$. (Recall that $0! := 1$.)


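9.4.14 Examples (a) If $f(x) := \sin x$, $x \in \mathbb{R}$, we have $f^{(2n)}(x) = (-1)^n \sin x$ and
$f^{(2n+1)}(x) = (-1)^n \cos x$ for $n \in \mathbb{N}$, $x \in \mathbb{R}$. Evaluating at $c = 0$, we get the Taylor
coefficients $a_{2n} = 0$ and $a_{2n+1} = (-1)^n/(2n+1)!$ for $n \in \mathbb{N}$. Since $|\sin x| \le 1$ and $|\cos x| \le 1$
for all $x$, then $|R_n(x)| \le |x|^n/n!$ for $n \in \mathbb{N}$ and $x \in \mathbb{R}$. Since $\lim(R_n(x)) = 0$ for each
$x \in \mathbb{R}$, we obtain the Taylor expansion

$$\sin x = \sum_{n=0}^{\infty} \frac{(-1)^n}{(2n+1)!}\, x^{2n+1} \quad \text{for all } x \in \mathbb{R}.$$

An application of Theorem 9.4.12 gives us the Taylor expansion

$$\cos x = \sum_{n=0}^{\infty} \frac{(-1)^n}{(2n)!}\, x^{2n} \quad \text{for all } x \in \mathbb{R}.$$

(b) If $g(x) := e^x$, $x \in \mathbb{R}$, then $g^{(n)}(x) = e^x$ for all $n \in \mathbb{N}$, and hence the Taylor coefficients
are given by $a_n = 1/n!$ for $n \in \mathbb{N}$. For a given $x \in \mathbb{R}$, we have $|R_n(x)| \le e^{|x|}|x|^n/n!$ and
therefore $(R_n(x))$ tends to 0 as $n \to \infty$. Therefore, we obtain the Taylor expansion

$$(7) \qquad e^x = \sum_{n=0}^{\infty} \frac{1}{n!}\, x^n \quad \text{for all } x \in \mathbb{R}.$$

We can obtain the Taylor expansion at an arbitrary $c \in \mathbb{R}$ by the device of replacing $x$ by
$x - c$ in (7) and noting that

$$e^x = e^c \cdot e^{x-c} = e^c \sum_{n=0}^{\infty} \frac{1}{n!} (x - c)^n = \sum_{n=0}^{\infty} \frac{e^c}{n!} (x - c)^n \quad \text{for } x \in \mathbb{R}.$$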
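
The convergence of these Taylor expansions can be observed numerically. The following Python sketch (an illustration only; the function names, the point x = 2, and the degree 9 are choices made here) compares Taylor polynomials of sin and exp at 0 with the library values. The bounds used in the code are the standard Lagrange-form estimates $|x|^{n+1}/(n+1)!$ and $e^{|x|}|x|^{n+1}/(n+1)!$ for the remainder after the polynomial of degree $n$.

```python
import math

def sin_taylor(x, degree):
    """Taylor polynomial of sin at 0, keeping all terms of degree <= `degree`."""
    total, n = 0.0, 0
    while 2 * n + 1 <= degree:
        total += (-1) ** n * x ** (2 * n + 1) / math.factorial(2 * n + 1)
        n += 1
    return total

def exp_taylor(x, degree):
    """Taylor polynomial of exp at 0 of the given degree."""
    return sum(x ** n / math.factorial(n) for n in range(degree + 1))

x, degree = 2.0, 9
sin_err = abs(math.sin(x) - sin_taylor(x, degree))
exp_err = abs(math.exp(x) - exp_taylor(x, degree))
# Lagrange-form bounds on the remainder after the degree-n polynomial
sin_bound = abs(x) ** (degree + 1) / math.factorial(degree + 1)
exp_bound = math.exp(abs(x)) * abs(x) ** (degree + 1) / math.factorial(degree + 1)
print(sin_err, sin_bound)
print(exp_err, exp_bound)
```

In both cases the observed error lies below the corresponding bound, and both shrink rapidly as the degree grows because $n!$ eventually dominates $|x|^n$.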


Exercises for Section 9.4 


1. Discuss the convergence and the uniform convergence of the series $\sum f_n$, where $f_n(x)$ is given
by:
(a) $(x^2 + n^2)^{-1}$, (b) $(nx)^{-2}$ $(x \ne 0)$,
(c) $\sin(x/n^2)$, (d) $(x^n + 1)^{-1}$ $(x \ne 0)$,
(e) $x^n/(x^n + 1)$ $(x \ge 0)$, (f) $(-1)^n (n + x)^{-1}$ $(x \ge 0)$.


2. If $\sum a_n$ is an absolutely convergent series, then the series $\sum a_n \sin nx$ is absolutely and
uniformly convergent.

3. Let $(c_n)$ be a decreasing sequence of positive numbers. If $\sum c_n \sin nx$ is uniformly convergent,
then $\lim(nc_n) = 0$.

4. Discuss the cases $R = 0$, $R = +\infty$ in the Cauchy-Hadamard Theorem 9.4.9.

5. Show that the radius of convergence $R$ of the power series $\sum a_n x^n$ is given by $\lim(|a_n/a_{n+1}|)$
whenever this limit exists. Give an example of a power series where this limit does not exist.


6. Determine the radius of convergence of the series $\sum a_n x^n$, where $a_n$ is given by:
(a) $1/n^n$, (b) $n^{\alpha}/n!$,
(c) $n^n/n!$, (d) $(\ln n)^{-1}$, $n \ge 2$,
(e) $(n!)^2/(2n)!$, (f) $n^{-\sqrt{n}}$.


7. If $a_n := 1$ when $n$ is the square of a natural number and $a_n := 0$ otherwise, find the radius of
convergence of $\sum a_n x^n$. If $b_n := 1$ when $n = m!$ for $m \in \mathbb{N}$ and $b_n := 0$ otherwise, find the
radius of convergence of the series $\sum b_n x^n$.

8. Prove in detail that $\limsup(|na_n|^{1/n}) = \limsup(|a_n|^{1/n})$.

9. If $0 < p \le |a_n| \le q$ for all $n \in \mathbb{N}$, find the radius of convergence of $\sum a_n x^n$.

10. Let $f(x) = \sum a_n x^n$ for $|x| < R$. If $f(x) = f(-x)$ for all $|x| < R$, show that $a_n = 0$ for all odd $n$.

11. Prove that if $f$ is defined for $|x| < r$ and if there exists a constant $B$ such that $|f^{(n)}(x)| \le B$ for all
$|x| < r$ and $n \in \mathbb{N}$, then the Taylor series expansion

$$\sum_{n=0}^{\infty} \frac{f^{(n)}(0)}{n!}\, x^n$$

converges to $f(x)$ for $|x| < r$.

12. Prove by Induction that the function given by $f(x) := e^{-1/x^2}$ for $x \ne 0$, $f(0) := 0$, has
derivatives of all orders at every point and that all of these derivatives vanish at $x = 0$. Hence
this function is not given by its Taylor expansion about $x = 0$.

13. Give an example of a function that is equal to its Taylor series expansion about $x = 0$ for
$x \ge 0$, but is not equal to this expansion for $x < 0$.

14. Use the Lagrange form of the remainder to justify the general Binomial Expansion

$$(1 + x)^m = \sum_{n=0}^{\infty} \binom{m}{n} x^n \quad \text{for } 0 \le x < 1.$$

15. (Geometric series) Show directly that if $|x| < 1$, then $1/(1 - x) = \sum_{n=0}^{\infty} x^n$.

16. Show by integrating the series for $1/(1 + x)$ that if $|x| < 1$, then

$$\ln(1 + x) = \sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n}\, x^n.$$

17. Show that if $|x| < 1$, then $\operatorname{Arctan} x = \sum_{n=0}^{\infty} \dfrac{(-1)^n}{2n+1}\, x^{2n+1}$.

18. Show that if $|x| < 1$, then $\operatorname{Arcsin} x = \sum_{n=0}^{\infty} \dfrac{1 \cdot 3 \cdots (2n-1)}{2 \cdot 4 \cdots 2n} \cdot \dfrac{x^{2n+1}}{2n+1}$.

19. Find a series expansion for $\int_0^x e^{-t^2}\,dt$ for $x \in \mathbb{R}$.

20. If $\alpha \in \mathbb{R}$ and $|k| < 1$, the integral $F(\alpha, k) := \int_0^{\alpha} \left(1 - k^2 (\sin x)^2\right)^{-1/2} dx$ is called an elliptic
integral of the first kind. Show that

$$F\left(\frac{\pi}{2}, k\right) = \frac{\pi}{2} \sum_{n=0}^{\infty} \left( \frac{1 \cdot 3 \cdots (2n-1)}{2 \cdot 4 \cdots 2n} \right)^2 k^{2n} \quad \text{for } |k| < 1.$$


CHAPTER 10 


THE GENERALIZED 
RIEMANN INTEGRAL 


In Chapter 7 we gave a rather complete discussion of the Riemann integral of a function on 
a closed bounded interval, defining the integral as the limit of Riemann sums of the 
function. This is the integral (and the approach) that the reader met in calculus courses; it is 
also the integral that is most frequently used in applications to engineering and other areas. 
We have seen that continuous and monotone functions on [a, b] are Riemann integrable, so 
most of the functions arising in calculus are included in its scope. 

However, by the end of the nineteenth century, some inadequacies in the Riemann 
theory of integration had become apparent. These failings came primarily from the fact that 
the collection of Riemann integrable functions became inconveniently small as mathematics 
developed. For example, the set of functions for which the Newton-Leibniz formula,

$$\int_a^b F' = F(b) - F(a),$$

holds does not include all differentiable functions. Also, limits of sequences of Riemann
integrable functions are not necessarily Riemann integrable. These inadequacies led others 
to invent other integration theories, the best known of which was due to Henri Lebesgue 
(1875-1941) and was developed at the very beginning of the twentieth century. (For an 
account of the history of the development of the Lebesgue integral, the reader should 
consult the book by Hawkins given in the References.) 

Indeed, the Lebesgue theory of integration has become pre-eminent in contemporary 
mathematical research, since it enables one to integrate a much larger collection of 
functions, and to take limits of integrals more freely. However, the Lebesgue integral also 
has several inadequacies and difficulties: (1) There exist functions F that are differentiable 
on [a, b] but such that F’ is not Lebesgue integrable. (2) Some “improper integrals,” such 
as the important Dirichlet integral,

$$\int_0^{\infty} \frac{\sin x}{x}\,dx,$$

do not exist as Lebesgue integrals. (3) Most treatments of the Lebesgue integral have
considerable prerequisites and are not easily within the reach of an undergraduate student 
of mathematics. 

As important as the Lebesgue integral is, there are even more inclusive theories of 
integration. One of these was developed independently in the late 1950s by the Czech 
mathematician Jaroslav Kurzweil (b. 1926) and the English mathematician Ralph 
Henstock (b. 1923). Surprisingly, their approach is only slightly different from that 
used by Riemann, yet it yields an integral (which we will call the generalized Riemann 
integral ) that includes both the Riemann and the Lebesgue integrals as special cases. Since 
the approach is so similar to that of Riemann, it is technically much simpler than the usual
Lebesgue integral, yet its scope is considerably greater; in particular, it includes functions
that are derivatives, and also includes all “improper integrals.”


Ralph Henstock and Jaroslav Kurzweil
Ralph Henstock (1923-2007), pictured on the left, was
born in Nottinghamshire, England, the son of a mineworker.
At an early age he showed that he was a gifted
scholar in mathematics and science. He entered St. John’s
College, Cambridge, in 1941, studying with J. D. Bernal,
G. H. Hardy, and J. C. Burkill and was classified
Wrangler in Part II of the Tripos Exams in 1943. He
earned his B.A. at Cambridge in 1944 and his Ph.D. at the
University of London in 1948. His research is in the theory
of summability, linear analysis, and integration theory.
Most of his teaching was in Northern Ireland.


Jaroslav Kurzweil (pictured on the right) was born on May 7, 1926, in Prague. A student of
V. Jarnik, he has done a considerable amount of research in the theory of differential equations and
the theory of integration, and also has had a serious interest in mathematical education. In 1964 he
was awarded the Klement Gottwald State Prize, and in 1981 he was awarded the Bolzano medal
of the Czechoslovak Academy of Sciences. Since 1989 he has been Director of the Mathematical
Institute of the Czech Academy of Sciences in Prague and has had a profound influence on the
mathematicians there. In 2006, he was awarded the “Czech Mind,” the highest scientific prize of
the Czech Republic.

In this chapter, we give an exposition of the generalized Riemann integral. In Section 
10.1, it will be seen that the basic theory is almost exactly the same as for the ordinary 
Riemann integral. However, we have omitted the proofs of a few results when their proofs 
are unduly complicated. In the short Section 10.2, we indicate that improper integrals on 
[a, b] are included in the generalized theory. We will introduce the class of Lebesgue 
integrable functions as those generalized integrable functions f whose absolute value |f| is 
also generalized integrable; this is a very different approach to the Lebesgue integral than is 
usual, but it gives the same class of functions. In Section 10.3, we will integrate functions 
on unbounded closed intervals. In the final section, we discuss the limit theorems that hold 
for the generalized Riemann and Lebesgue integrals, and we will give some interesting 
applications of these theorems. We will also define what is meant by a “measurable 
function” and relate that notion to generalized integrability. 

Readers wishing to study the proofs that are omitted here should consult the first 
author’s book, A Modern Theory of Integration, which we refer to as [MTI], or the books of 
DePree and Swartz, Gordon, and McLeod listed in the References. 


Section 10.1 Definition and Main Properties 


In Definition 5.5.2, we defined a gauge on $[a, b]$ to be a strictly positive function $\delta : [a, b] \to (0, \infty)$.
Further, a tagged partition $\dot{\mathcal{P}} := \{(I_i, t_i)\}_{i=1}^{n}$ of $[a, b]$, where $I_i := [x_{i-1}, x_i]$, is said
to be $\delta$-fine in the case

$$(1) \qquad t_i \in I_i \subseteq [t_i - \delta(t_i),\, t_i + \delta(t_i)] \quad \text{for } i = 1, \ldots, n.$$

This is shown in Figure 5.5.1. Note that (i) only a tagged partition can be $\delta$-fine, and (ii) the
$\delta$-fineness of a tagged partition depends on the choice of the tags $t_i$ and the values $\delta(t_i)$.

In Examples 5.5.4, we gave some specific examples of gauges, and in Theorem 5.5.5
we showed that if $\delta$ is any gauge on $[a, b]$, then there exist $\delta$-fine tagged partitions of $[a, b]$.

We will define the generalized Riemann (or the “Henstock-Kurzweil”) integral. It
will be seen that the definition is very similar to that of the ordinary Riemann integral, and
that many of the proofs are essentially the same. Indeed, the only difference between the
definitions of these integrals is that the notion of smallness of a tagged partition is specified
by a gauge, rather than its norm. It will be seen that this apparently minor difference
results in a very much larger class of integrable functions. In order to avoid some
complications, a few proofs will be omitted; they can be found in [MTI].

Before we begin our study, it is appropriate that we ask: Why are gauges more useful 
than norms? Briefly, the reason is that the norm of a partition is a rather coarse measure of 
the fineness of the partition, since it is merely the length of the largest subinterval in the 
partition. On the other hand, gauges can give one more delicate control of the subintervals 
in the partitions, by requiring the use of small subintervals when the function is varying
rapidly but permitting the use of larger subintervals when the function is nearly constant. 
Moreover, gauges can be used to force specific points to be tags; this is often useful when 
unusual behavior takes place at such a point. Since gauges are more flexible than norms, 
their use permits a larger class of functions to become integrable. 


10.1.1 Definition A function $f : [a, b] \to \mathbb{R}$ is said to be generalized Riemann integrable
on $[a, b]$ if there exists a number $L \in \mathbb{R}$ such that for every $\varepsilon > 0$ there exists a gauge $\delta_\varepsilon$
on $[a, b]$ such that if $\dot{\mathcal{P}}$ is any $\delta_\varepsilon$-fine partition of $[a, b]$, then

$$|S(f; \dot{\mathcal{P}}) - L| < \varepsilon.$$

The collection of all generalized Riemann integrable functions will usually be denoted by
$\mathcal{R}^*[a, b]$.


It will be shown that if $f \in \mathcal{R}^*[a, b]$, then the number $L$ is uniquely determined; it will
be called the generalized Riemann integral of $f$ over $[a, b]$. It will also be shown that if
$f \in \mathcal{R}[a, b]$, then $f \in \mathcal{R}^*[a, b]$ and the value of the two integrals is the same. Therefore, it
will not cause any ambiguity if we also denote the generalized Riemann integral of
$f \in \mathcal{R}^*[a, b]$ by the symbols

$$\int_a^b f \qquad \text{or} \qquad \int_a^b f(x)\,dx.$$


Our first result gives the uniqueness of the value of the generalized Riemann integral. 
Although its proof is almost identical to that of Theorem 7.1.2, we will write it out to show 
how gauges are used instead of norms of partitions. 


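10.1.2 Uniqueness Theorem If $f \in \mathcal{R}^*[a, b]$, then the value of the integral is uniquely
determined.


Proof. Assume that $L'$ and $L''$ both satisfy the definition and let $\varepsilon > 0$. Thus there exists a
gauge $\delta'_{\varepsilon/2}$ such that if $\dot{\mathcal{P}}_1$ is any $\delta'_{\varepsilon/2}$-fine partition, then

$$|S(f; \dot{\mathcal{P}}_1) - L'| < \varepsilon/2.$$

Also there exists a gauge $\delta''_{\varepsilon/2}$ such that if $\dot{\mathcal{P}}_2$ is any $\delta''_{\varepsilon/2}$-fine partition, then

$$|S(f; \dot{\mathcal{P}}_2) - L''| < \varepsilon/2.$$

We define $\delta_\varepsilon$ by $\delta_\varepsilon(t) := \min\{\delta'_{\varepsilon/2}(t), \delta''_{\varepsilon/2}(t)\}$ for $t \in [a, b]$, so that $\delta_\varepsilon$ is a gauge on $[a, b]$.
If $\dot{\mathcal{P}}$ is a $\delta_\varepsilon$-fine partition, then the partition $\dot{\mathcal{P}}$ is both $\delta'_{\varepsilon/2}$-fine and $\delta''_{\varepsilon/2}$-fine, so that

$$|S(f; \dot{\mathcal{P}}) - L'| < \varepsilon/2 \quad \text{and} \quad |S(f; \dot{\mathcal{P}}) - L''| < \varepsilon/2,$$

whence it follows that

$$|L' - L''| \le |L' - S(f; \dot{\mathcal{P}})| + |S(f; \dot{\mathcal{P}}) - L''| < \varepsilon/2 + \varepsilon/2 = \varepsilon.$$

Since $\varepsilon > 0$ is arbitrary, it follows that $L' = L''$. Q.E.D.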


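We now show that every Riemann integrable function $f$ is also generalized Riemann
integrable, and with the same value for the integral. This is done by using a gauge that is a
constant function.


10.1.3 Consistency Theorem If $f \in \mathcal{R}[a, b]$ with integral $L$, then also $f \in \mathcal{R}^*[a, b]$ with
integral $L$.


Proof. Given $\varepsilon > 0$, we need to construct an appropriate gauge on $[a, b]$. Since
$f \in \mathcal{R}[a, b]$, there exists a number $\delta_\varepsilon > 0$ such that if $\dot{\mathcal{P}}$ is any tagged partition with
$\|\dot{\mathcal{P}}\| < \delta_\varepsilon$, then $|S(f; \dot{\mathcal{P}}) - L| < \varepsilon$. We define the function $\delta^*_\varepsilon(t) := \frac{1}{4}\delta_\varepsilon$ for $t \in [a, b]$, so that
$\delta^*_\varepsilon$ is a gauge on $[a, b]$.

If $\dot{\mathcal{P}} = \{(I_i, t_i)\}_{i=1}^{n}$, where $I_i := [x_{i-1}, x_i]$, is a $\delta^*_\varepsilon$-fine partition, then since

$$I_i \subseteq [t_i - \delta^*_\varepsilon(t_i),\, t_i + \delta^*_\varepsilon(t_i)] = [t_i - \tfrac{1}{4}\delta_\varepsilon,\, t_i + \tfrac{1}{4}\delta_\varepsilon],$$

it is readily seen that $0 < x_i - x_{i-1} \le \frac{1}{2}\delta_\varepsilon < \delta_\varepsilon$ for all $i = 1, \ldots, n$. Therefore this partition
also satisfies $\|\dot{\mathcal{P}}\| < \delta_\varepsilon$ and consequently $|S(f; \dot{\mathcal{P}}) - L| < \varepsilon$.

Thus every $\delta^*_\varepsilon$-fine partition $\dot{\mathcal{P}}$ also satisfies $|S(f; \dot{\mathcal{P}}) - L| < \varepsilon$. Since $\varepsilon > 0$ is
arbitrary, it follows that $f$ is generalized Riemann integrable to $L$. Q.E.D.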


From Theorems 7.2.5, 7.2.7, and 7.2.8, we conclude that: Every step function, every
continuous function, and every monotone function belongs to $\mathcal{R}^*[a, b]$. We will now show
that Dirichlet’s function, which was shown not to be Riemann integrable in 7.2.2(b) and
7.3.13(d), is generalized Riemann integrable.


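10.1.4 Examples (a) The Dirichlet function $f$ belongs to $\mathcal{R}^*[0, 1]$ and has integral 0.
We enumerate the rational numbers in $[0, 1]$ as $\{r_k\}_{k=1}^{\infty}$. Given $\varepsilon > 0$ we define $\delta_\varepsilon(r_k)
:= \varepsilon/2^{k+2}$ and $\delta_\varepsilon(x) := 1$ when $x$ is irrational. Thus $\delta_\varepsilon$ is a gauge on $[0, 1]$ and if the
partition $\dot{\mathcal{P}} := \{(I_i, t_i)\}_{i=1}^{n}$ is $\delta_\varepsilon$-fine, then we have $x_i - x_{i-1} \le 2\delta_\varepsilon(t_i)$. Since the only
nonzero contributions to $S(f; \dot{\mathcal{P}})$ come from rational tags $t_i = r_k$, where

$$0 < f(r_k)(x_i - x_{i-1}) = 1 \cdot (x_i - x_{i-1}) \le \frac{2\varepsilon}{2^{k+2}} = \frac{\varepsilon}{2^{k+1}},$$

and since each such tag can occur in at most two subintervals, we have

$$0 \le S(f; \dot{\mathcal{P}}) < \sum_{k=1}^{\infty} \frac{2\varepsilon}{2^{k+1}} = \sum_{k=1}^{\infty} \frac{\varepsilon}{2^{k}} = \varepsilon.$$

Since $\varepsilon > 0$ is arbitrary, then $f \in \mathcal{R}^*[0, 1]$ and $\int_0^1 f = 0$.

(b) Let $H : [0, 1] \to \mathbb{R}$ be defined by $H(1/k) := k$ for $k \in \mathbb{N}$ and $H(x) := 0$ elsewhere on
$[0, 1]$.

Since $H$ is not bounded on $[0, 1]$, it follows from the Boundedness Theorem 7.1.6 that
it is not Riemann integrable on $[0, 1]$. We will now show that $H$ is generalized Riemann
integrable to 0.

In fact, given $\varepsilon > 0$, we define $\delta_\varepsilon(1/k) := \varepsilon/(k2^{k+2})$ and set $\delta_\varepsilon(x) := 1$ elsewhere on
$[0, 1]$, so $\delta_\varepsilon$ is a gauge on $[0, 1]$. If $\dot{\mathcal{P}}$ is a $\delta_\varepsilon$-fine partition of $[0, 1]$ then $x_i - x_{i-1} \le 2\delta_\varepsilon(t_i)$.
Since the only nonzero contributions to $S(H; \dot{\mathcal{P}})$ come from tags $t_i = 1/k$, where

$$0 < H(1/k)(x_i - x_{i-1}) = k \cdot (x_i - x_{i-1}) \le k \cdot \frac{2\varepsilon}{k2^{k+2}} = \frac{\varepsilon}{2^{k+1}},$$

and since each such tag can occur in at most two subintervals, we have

$$0 \le S(H; \dot{\mathcal{P}}) < \sum_{k=1}^{\infty} \frac{\varepsilon}{2^{k}} = \varepsilon.$$

Since $\varepsilon > 0$ is arbitrary, then $H \in \mathcal{R}^*[0, 1]$ and $\int_0^1 H = 0$.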
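
The gauge in Example 10.1.4(b) can be tested on a concrete tagged partition. The Python sketch below is an illustration under stated assumptions: the helper names (`k_of`, `gauge`, `is_fine`, `sample_partition`) are hypothetical, exact rational arithmetic is used, and the partition built is just one convenient $\delta_\varepsilon$-fine partition that uses each point $1/k$, $k \le K$, as a tag (it assumes $\varepsilon \le 1$ so that the small intervals around the tags do not overlap). It does not prove the estimate, which must hold for every $\delta_\varepsilon$-fine partition.

```python
from fractions import Fraction

def k_of(t):
    """Return k if t == 1/k for some natural number k, else None."""
    if t > 0 and (1 / t).denominator == 1:
        return (1 / t).numerator
    return None

def H(t):
    k = k_of(t)
    return Fraction(k) if k else Fraction(0)

def gauge(t, eps):
    """The gauge of the example: eps/(k 2^(k+2)) at 1/k, and 1 elsewhere."""
    k = k_of(t)
    return eps / (k * 2 ** (k + 2)) if k else Fraction(1)

def is_fine(partition, eps):
    """Check t in [u, v] and [u, v] inside [t - gauge(t), t + gauge(t)]."""
    return all(u <= t <= v and t - gauge(t, eps) <= u and v <= t + gauge(t, eps)
               for (u, v, t) in partition)

def riemann_sum(partition):
    return sum(H(t) * (v - u) for (u, v, t) in partition)

def sample_partition(eps, K):
    """A tagged partition of [0, 1] using every 1/k with k <= K as a tag."""
    pieces, left = [], Fraction(0)
    for k in range(K, 0, -1):                  # tags 1/K < ... < 1/2 < 1
        t = Fraction(1, k)
        d = gauge(t, eps)
        lo, hi = t - d, min(t + d, Fraction(1))
        # Filler piece between tagged pieces; H vanishes at its tag.
        filler_tag = Fraction(0) if left == 0 else (left + lo) / 2
        pieces.append((left, lo, filler_tag))
        pieces.append((lo, hi, t))
        left = hi
    return pieces

eps = Fraction(1, 10)
P = sample_partition(eps, K=6)
print(is_fine(P, eps), riemann_sum(P), riemann_sum(P) < eps)
```

Although $H$ takes the values 1 through 6 at the tags used, the gauge forces the corresponding subintervals to be so short that the Riemann sum stays below $\varepsilon$.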


The next result is exactly similar to Theorem 7.1.5. 


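10.1.5 Theorem Suppose that $f$ and $g$ are in $\mathcal{R}^*[a, b]$. Then:
(a) If $k \in \mathbb{R}$, the function $kf$ is in $\mathcal{R}^*[a, b]$ and

$$\int_a^b kf = k \int_a^b f.$$

(b) The function $f + g$ is in $\mathcal{R}^*[a, b]$ and

$$\int_a^b (f + g) = \int_a^b f + \int_a^b g.$$

(c) If $f(x) \le g(x)$ for all $x \in [a, b]$, then

$$\int_a^b f \le \int_a^b g.$$


Proof. (b) Given $\varepsilon > 0$, we can use the argument in the proof of the Uniqueness Theorem
10.1.2 to construct a gauge $\delta_\varepsilon$ on $[a, b]$ such that if $\dot{\mathcal{P}}$ is any $\delta_\varepsilon$-fine partition of $[a, b]$, then

$$\left| S(f; \dot{\mathcal{P}}) - \int_a^b f \right| < \varepsilon/2 \quad \text{and} \quad \left| S(g; \dot{\mathcal{P}}) - \int_a^b g \right| < \varepsilon/2.$$

Since $S(f + g; \dot{\mathcal{P}}) = S(f; \dot{\mathcal{P}}) + S(g; \dot{\mathcal{P}})$, it follows as in the proof of Theorem 7.1.5(b) that

$$\left| S(f + g; \dot{\mathcal{P}}) - \left( \int_a^b f + \int_a^b g \right) \right| \le \left| S(f; \dot{\mathcal{P}}) - \int_a^b f \right| + \left| S(g; \dot{\mathcal{P}}) - \int_a^b g \right| < \varepsilon/2 + \varepsilon/2 = \varepsilon.$$

Since $\varepsilon > 0$ is arbitrary, then $f + g \in \mathcal{R}^*[a, b]$ and its integral is the sum of the integrals of $f$
and $g$.
The proofs of (a) and (c) are analogous and are left to the reader. Q.E.D.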


It might be expected that an argument similar to that given in Theorem 7.1.6 can be 
used to show that a function in R* [a,b] is necessarily bounded. However, that is not the 
case; indeed, we have already seen an unbounded function in R*[0, 1] in Example 10.1.4(b) 
and will encounter more later. However, it is a profitable exercise for the reader to 
determine exactly where the proof of Theorem 7.1.6 breaks down for a function in R*[a, b]. 


The Cauchy Criterion 


There is an analogous form for the Cauchy Criterion for functions in R*[a,b]. It is 
important because it eliminates the need to know the value of the integral. Its proof is 
essentially the same as that of 7.2.1. 


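10.1.6 Cauchy Criterion A function $f : [a, b] \to \mathbb{R}$ belongs to $\mathcal{R}^*[a, b]$ if and only if for
every $\varepsilon > 0$ there exists a gauge $\eta_\varepsilon$ on $[a, b]$ such that if $\dot{\mathcal{P}}$ and $\dot{\mathcal{Q}}$ are any partitions of
$[a, b]$ that are $\eta_\varepsilon$-fine, then

$$|S(f; \dot{\mathcal{P}}) - S(f; \dot{\mathcal{Q}})| < \varepsilon.$$

Proof. ($\Rightarrow$) If $f \in \mathcal{R}^*[a, b]$ with integral $L$, let $\delta_{\varepsilon/2}$ be a gauge on $[a, b]$ such that if $\dot{\mathcal{P}}$ and $\dot{\mathcal{Q}}$
are $\delta_{\varepsilon/2}$-fine partitions of $[a, b]$, then

$$|S(f; \dot{\mathcal{P}}) - L| < \varepsilon/2 \quad \text{and} \quad |S(f; \dot{\mathcal{Q}}) - L| < \varepsilon/2.$$

We set $\eta_\varepsilon(t) := \delta_{\varepsilon/2}(t)$ for $t \in [a, b]$, so if $\dot{\mathcal{P}}$ and $\dot{\mathcal{Q}}$ are $\eta_\varepsilon$-fine, then

$$|S(f; \dot{\mathcal{P}}) - S(f; \dot{\mathcal{Q}})| \le |S(f; \dot{\mathcal{P}}) - L| + |L - S(f; \dot{\mathcal{Q}})| < \varepsilon/2 + \varepsilon/2 = \varepsilon.$$

($\Leftarrow$) For each $n \in \mathbb{N}$, let $\delta_n$ be a gauge on $[a, b]$ such that if $\dot{\mathcal{P}}$ and $\dot{\mathcal{Q}}$ are partitions that
are $\delta_n$-fine, then

$$|S(f; \dot{\mathcal{P}}) - S(f; \dot{\mathcal{Q}})| < 1/n.$$

We may assume that $\delta_n(t) \ge \delta_{n+1}(t)$ for all $t \in [a, b]$ and $n \in \mathbb{N}$; otherwise, we replace $\delta_n$
by the gauge $\delta'_n(t) := \min\{\delta_1(t), \ldots, \delta_n(t)\}$ for all $t \in [a, b]$.

For each $n \in \mathbb{N}$, let $\dot{\mathcal{P}}_n$ be a partition that is $\delta_n$-fine. Clearly, if $m > n$ then both $\dot{\mathcal{P}}_m$ and
$\dot{\mathcal{P}}_n$ are $\delta_n$-fine, so that

$$(2) \qquad |S(f; \dot{\mathcal{P}}_n) - S(f; \dot{\mathcal{P}}_m)| < 1/n \quad \text{for } m > n.$$

Consequently, the sequence $(S(f; \dot{\mathcal{P}}_m))_{m=1}^{\infty}$ is a Cauchy sequence in $\mathbb{R}$, so it converges to
some number $A$. Passing to the limit in (2) as $m \to \infty$, we have

$$|S(f; \dot{\mathcal{P}}_n) - A| \le 1/n \quad \text{for all } n \in \mathbb{N}.$$

To see that $A$ is the generalized Riemann integral of $f$, given $\varepsilon > 0$, let $K \in \mathbb{N}$ satisfy
$K > 2/\varepsilon$. If $\dot{\mathcal{Q}}$ is a $\delta_K$-fine partition, then

$$|S(f; \dot{\mathcal{Q}}) - A| \le |S(f; \dot{\mathcal{Q}}) - S(f; \dot{\mathcal{P}}_K)| + |S(f; \dot{\mathcal{P}}_K) - A| \le 1/K + 1/K < \varepsilon.$$

Since $\varepsilon > 0$ is arbitrary, then $f \in \mathcal{R}^*[a, b]$ with integral $A$. Q.E.D.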


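10.1.7 Squeeze Theorem Let $f : [a, b] \to \mathbb{R}$. Then $f \in \mathcal{R}^*[a, b]$ if and only if for every
$\varepsilon > 0$ there exist functions $\alpha_\varepsilon$ and $\omega_\varepsilon$ in $\mathcal{R}^*[a, b]$ with

$$\alpha_\varepsilon(x) \le f(x) \le \omega_\varepsilon(x) \quad \text{for all } x \in [a, b],$$

and such that

$$\int_a^b (\omega_\varepsilon - \alpha_\varepsilon) \le \varepsilon.$$


The proof of this result is exactly similar to the proof of Theorem 7.2.3, and will be left
to the reader.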


The Additivity Theorem 


We now present a result quite analogous to Theorem 7.2.9. Its proof is a modification of the 
proof of that theorem, but since it is somewhat technical, the reader may choose to omit the 
proof on a first reading. 


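10.1.8 Additivity Theorem Let $f : [a, b] \to \mathbb{R}$ and let $c \in (a, b)$. Then $f \in \mathcal{R}^*[a, b]$ if
and only if its restrictions to $[a, c]$ and $[c, b]$ are both generalized Riemann integrable. In
this case

$$(3) \qquad \int_a^b f = \int_a^c f + \int_c^b f.$$


Proof. ($\Leftarrow$) Suppose that the restriction $f_1$ of $f$ to $[a, c]$ and the restriction $f_2$ of $f$ to $[c, b]$ are
generalized Riemann integrable to $L_1$ and $L_2$, respectively. Then, given $\varepsilon > 0$ there exists a
gauge $\delta'$ on $[a, c]$ such that if $\dot{\mathcal{P}}_1$ is a $\delta'$-fine partition of $[a, c]$ then $|S(f_1; \dot{\mathcal{P}}_1) - L_1| < \varepsilon/2$.
Also there exists a gauge $\delta''$ on $[c, b]$ such that if $\dot{\mathcal{P}}_2$ is a $\delta''$-fine partition of $[c, b]$ then
$|S(f_2; \dot{\mathcal{P}}_2) - L_2| < \varepsilon/2$.

We now define a gauge $\delta_\varepsilon$ on $[a, b]$ by

$$\delta_\varepsilon(t) := \begin{cases} \min\{\delta'(t), \tfrac{1}{2}(c - t)\} & \text{for } t \in [a, c), \\ \min\{\delta'(c), \delta''(c)\} & \text{for } t = c, \\ \min\{\delta''(t), \tfrac{1}{2}(t - c)\} & \text{for } t \in (c, b]. \end{cases}$$

(This gauge has the property that any $\delta_\varepsilon$-fine partition must have $c$ as a tag for any
subinterval containing the point $c$.)

We will show that if $\dot{\mathcal{Q}}$ is any $\delta_\varepsilon$-fine partition of $[a, b]$, then there exist a $\delta'$-fine
partition $\dot{\mathcal{Q}}_1$ of $[a, c]$ and a $\delta''$-fine partition $\dot{\mathcal{Q}}_2$ of $[c, b]$ such that

$$(4) \qquad S(f; \dot{\mathcal{Q}}) = S(f_1; \dot{\mathcal{Q}}_1) + S(f_2; \dot{\mathcal{Q}}_2).$$

Case (i) If $c$ is a partition point of $\dot{\mathcal{Q}}$, then it belongs to two subintervals of $\dot{\mathcal{Q}}$ and is the
tag for both of these subintervals. If $\dot{\mathcal{Q}}_1$ consists of the part of $\dot{\mathcal{Q}}$ having subintervals in
$[a, c]$, then $\dot{\mathcal{Q}}_1$ is $\delta'$-fine. Similarly, if $\dot{\mathcal{Q}}_2$ consists of the part of $\dot{\mathcal{Q}}$ having subintervals in
$[c, b]$, then $\dot{\mathcal{Q}}_2$ is $\delta''$-fine. The relation (4) is now clear.

Case (ii) If $c$ is not a partition point in $\dot{\mathcal{Q}} = \{(I_i, t_i)\}_{i=1}^{m}$, then it is the tag for some
subinterval, say $[x_{k-1}, x_k]$. We replace the pair $([x_{k-1}, x_k], c)$ by the two pairs $([x_{k-1}, c], c)$
and $([c, x_k], c)$, and let $\dot{\mathcal{Q}}_1$ and $\dot{\mathcal{Q}}_2$ be the tagged partitions of $[a, c]$ and $[c, b]$ that result.
Since $f(c)(x_k - x_{k-1}) = f(c)(c - x_{k-1}) + f(c)(x_k - c)$, it is seen that the relation (4)
also holds.

In either case, equation (4) and the Triangle Inequality imply that

$$|S(f; \dot{\mathcal{Q}}) - (L_1 + L_2)| = |(S(f_1; \dot{\mathcal{Q}}_1) + S(f_2; \dot{\mathcal{Q}}_2)) - (L_1 + L_2)| \le |S(f_1; \dot{\mathcal{Q}}_1) - L_1| + |S(f_2; \dot{\mathcal{Q}}_2) - L_2|.$$

Since $\dot{\mathcal{Q}}_1$ is $\delta'$-fine and $\dot{\mathcal{Q}}_2$ is $\delta''$-fine, we conclude that

$$|S(f; \dot{\mathcal{Q}}) - (L_1 + L_2)| < \varepsilon.$$

Since $\varepsilon > 0$ is arbitrary, we infer that $f \in \mathcal{R}^*[a, b]$ and that (3) holds.

($\Rightarrow$) Suppose that $f \in \mathcal{R}^*[a, b]$ and, given $\varepsilon > 0$, let the gauge $\eta_\varepsilon$ satisfy the Cauchy
Criterion. Let $f_1$ be the restriction of $f$ to $[a, c]$ and let $\dot{\mathcal{P}}_1$, $\dot{\mathcal{Q}}_1$ be $\eta_\varepsilon$-fine partitions of $[a, c]$.
By adding additional partition points and tags from $[c, b]$, we can extend $\dot{\mathcal{P}}_1$ and $\dot{\mathcal{Q}}_1$ to
$\eta_\varepsilon$-fine partitions $\dot{\mathcal{P}}$ and $\dot{\mathcal{Q}}$ of $[a, b]$. If we use the same additional points and tags in $[c, b]$
for both $\dot{\mathcal{P}}$ and $\dot{\mathcal{Q}}$, then

$$S(f; \dot{\mathcal{P}}) - S(f; \dot{\mathcal{Q}}) = S(f_1; \dot{\mathcal{P}}_1) - S(f_1; \dot{\mathcal{Q}}_1).$$

Since both $\dot{\mathcal{P}}$ and $\dot{\mathcal{Q}}$ are $\eta_\varepsilon$-fine, then $|S(f_1; \dot{\mathcal{P}}_1) - S(f_1; \dot{\mathcal{Q}}_1)| < \varepsilon$ also holds. Therefore the
Cauchy Condition shows that the restriction $f_1$ of $f$ to $[a, c]$ is in $\mathcal{R}^*[a, c]$. Similarly, the
restriction $f_2$ of $f$ to $[c, b]$ is in $\mathcal{R}^*[c, b]$.

The equality (3) now follows from the first part of the theorem. Q.E.D.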


It is easy to see that results exactly similar to 7.2.10-7.2.13 hold for the generalized
Riemann integral. We leave their statements to the reader, but will use these results freely. 


The Fundamental Theorem (First Form) 


We will now give versions of the Fundamental Theorems for the generalized Riemann 
integral. It will be seen that the First Form is significantly stronger than for the (ordinary) 
Riemann integral; indeed, we will show that the derivative of any function automatically 
belongs to R*[a, b], so the integrability of the function becomes a conclusion, rather than a 
hypothesis. 


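10.1.9 The Fundamental Theorem of Calculus (First Form) Suppose there exists a
countable set $E$ in $[a, b]$, and functions $f, F : [a, b] \to \mathbb{R}$ such that:

(a) $F$ is continuous on $[a, b]$.
(b) $F'(x) = f(x)$ for all $x \in [a, b] \setminus E$.

Then $f$ belongs to $\mathcal{R}^*[a, b]$ and

$$(5) \qquad \int_a^b f = F(b) - F(a).$$


Proof. We will prove the theorem in the case where $E = \emptyset$, leaving the general case to be
handled in the exercises.

Thus, we assume that (b) holds for all $x \in [a, b]$. Since we wish to show that
$f \in \mathcal{R}^*[a, b]$, given $\varepsilon > 0$, we need to construct a gauge $\delta_\varepsilon$; this will be done by using
the differentiability of $F$ on $[a, b]$. If $t \in [a, b]$, since the derivative $f(t) = F'(t)$ exists, there
exists $\delta_\varepsilon(t) > 0$ such that if $0 < |z - t| \le \delta_\varepsilon(t)$, $z \in [a, b]$, then

$$\left| \frac{F(z) - F(t)}{z - t} - f(t) \right| < \tfrac{1}{2}\varepsilon.$$

If we multiply this inequality by $|z - t|$, we obtain

$$|F(z) - F(t) - f(t)(z - t)| \le \tfrac{1}{2}\varepsilon |z - t|$$

whenever $z \in [t - \delta_\varepsilon(t), t + \delta_\varepsilon(t)] \cap [a, b]$. The function $\delta_\varepsilon$ is our desired gauge.

Now let $u, v \in [a, b]$ with $u < v$ satisfy $t \in [u, v] \subseteq [t - \delta_\varepsilon(t), t + \delta_\varepsilon(t)]$. If we subtract
and add the term $F(t) - f(t) \cdot t$ and use the Triangle Inequality and the fact that $v - t \ge 0$
and $t - u \ge 0$, we get

$$|F(v) - F(u) - f(t)(v - u)| \le |F(v) - F(t) - f(t)(v - t)| + |F(t) - F(u) - f(t)(t - u)| \le \tfrac{1}{2}\varepsilon(v - t) + \tfrac{1}{2}\varepsilon(t - u) = \tfrac{1}{2}\varepsilon(v - u).$$

Therefore, if $t \in [u, v] \subseteq [t - \delta_\varepsilon(t), t + \delta_\varepsilon(t)]$, then we have

$$(6) \qquad |F(v) - F(u) - f(t)(v - u)| \le \tfrac{1}{2}\varepsilon(v - u).$$

We will show that $f \in \mathcal{R}^*[a, b]$ with integral given by the telescoping sum

$$(7) \qquad F(b) - F(a) = \sum_{i=1}^{n} \{F(x_i) - F(x_{i-1})\}.$$

For, if the partition $\dot{\mathcal{P}} := \{([x_{i-1}, x_i], t_i)\}_{i=1}^{n}$ is $\delta_\varepsilon$-fine, then

$$t_i \in [x_{i-1}, x_i] \subseteq [t_i - \delta_\varepsilon(t_i),\, t_i + \delta_\varepsilon(t_i)] \quad \text{for } i = 1, \ldots, n,$$

and so we can use (7), the Triangle Inequality, and (6) to obtain

$$|F(b) - F(a) - S(f; \dot{\mathcal{P}})| = \left| \sum_{i=1}^{n} \{F(x_i) - F(x_{i-1}) - f(t_i)(x_i - x_{i-1})\} \right| \le \sum_{i=1}^{n} |F(x_i) - F(x_{i-1}) - f(t_i)(x_i - x_{i-1})| \le \sum_{i=1}^{n} \tfrac{1}{2}\varepsilon(x_i - x_{i-1}) < \varepsilon(b - a).$$

Since $\varepsilon > 0$ is arbitrary, we conclude that $f \in \mathcal{R}^*[a, b]$ and (5) holds. Q.E.D.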


10.1.10 Examples (a) IfH(x) := 2,/x for x € [0, b], then H is continuous on [0, b] and 
H' (x) = 1/y/x for x € (0,b]. We define A(x) := H'(x) for x € (0,b] and A(0) := 0. It 
follows from the Fundamental Theorem 10.1.9 with E := {0} that / belongs to R*[0, b] 
and that fon = H(b) — H(0) = H(b), which we write as 


b 
1 
—dx = 2v/b. 
ee 
(b) More generally, if œ > 0, let Ha(x) := x%/a = e*"* /æ for x € (0, b] and H,(0) := 0 
so that Hy is continuous on [0, b] and H’, (x) = x%~! for all x € (0,5); see 8.3.10 and 
8.3.13. We define ha(x) := H4(x) for x € (0, b] and h,(0) := 0. 
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Then Theorem 10.1.9 implies that hy € R*[0, b] and that f ha = Ha(b) — Ha (0) = 
H,(b), which we write as 
b a 
J x“ ldx = E 
0 a 


(c) Let L(x) := xInx — x for x € (0,5] and L(0) := 0. Then L is continuous on [0, b] 
(use l’Hospital’s Rule at x = 0), and it is seen that L'(x) = Inx for x € (0, b]. 

It follows from Theorem 10.1.9 with E = {0} that the unbounded function /(x) := Inx 
for x € (0, b] and/(0) := 0 belongs to R* (0, b] and that f? l = L(b) — L(0), which we write as 


b 
| Inxdx = blnb —b. 
0 
(d) Let $A(x) := \operatorname{Arcsin} x$ for $x \in [-1, 1]$ so that $A$ is continuous on $[-1, 1]$ and
$A'(x) = 1/\sqrt{1 - x^2}$ for $x \in (-1, 1)$. We define $s(x) := A'(x)$ for $x \in (-1, 1)$ and let
$s(-1) = s(1) := 0$.

Then Theorem 10.1.9 with $E = \{-1, 1\}$ implies that $s \in \mathcal{R}^*[-1, 1]$ and that
$\int_{-1}^{1} s = A(1) - A(-1) = \pi$, which we write as

\[ \int_{-1}^{1} \frac{dx}{\sqrt{1 - x^2}} = \operatorname{Arcsin} 1 - \operatorname{Arcsin}(-1) = \pi. \]
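
The values in Examples 10.1.10 can be checked numerically. The following is only a rough sketch: it uses equal subintervals tagged at their midpoints (so the tags never land on the exceptional points), which is not the gauge-fine construction of the theory, and the helper name `midpoint_sum` and the step count `n` are assumptions made for this illustration.

```python
import math

def midpoint_sum(f, a, b, n):
    """Riemann sum of f over [a, b]: n equal subintervals, each tagged at its midpoint."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

b = 1.0
n = 200_000  # illustrative choice; convergence near a singular endpoint is slow

# (a) integrand 1/sqrt(x), primitive H(x) = 2*sqrt(x)
approx_a = midpoint_sum(lambda x: 1.0 / math.sqrt(x), 0.0, b, n)
exact_a = 2.0 * math.sqrt(b)

# (c) integrand ln x, primitive L(x) = x ln x - x
approx_c = midpoint_sum(math.log, 0.0, b, n)
exact_c = b * math.log(b) - b

# (d) integrand 1/sqrt(1 - x^2) on [-1, 1], primitive Arcsin
approx_d = midpoint_sum(lambda x: 1.0 / math.sqrt(1.0 - x * x), -1.0, 1.0, n)
exact_d = math.pi

for name, approx, exact in [("a", approx_a, exact_a),
                            ("c", approx_c, exact_c),
                            ("d", approx_d, exact_d)]:
    print(name, approx, exact, abs(approx - exact))
```

The sums for (a) and (d) approach the exact values only slowly, which reflects the unbounded integrands near the endpoints.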


The Fundamental Theorem (Second Form) 


We now turn to the Second Form of the Fundamental Theorem, in which we wish to 
differentiate the indefinite integral F of f, defined by: 


\[ F(z) := \int_a^z f(x)\,dx \quad \text{for } z \in [a, b]. \tag{8} \]


10.1.11 Fundamental Theorem of Calculus (Second Form) Let $f$ belong to $\mathcal{R}^*[a, b]$
and let $F$ be the indefinite integral of $f$. Then we have:

(a) $F$ is continuous on $[a, b]$.

(b) There exists a null set $Z$ such that if $x \in [a, b] \setminus Z$, then $F$ is differentiable at $x$ and
$F'(x) = f(x)$.

(c) If $f$ is continuous at $c \in [a, b]$, then $F'(c) = f(c)$.


Proof. The proofs of (a) and (b) can be found in [MTI]. The proof of (c) is exactly as the 
proof of Theorem 7.3.5 except that we use Theorems 10.1.5(c) and 10.1.8. Q.E.D. 


We can restate conclusion (b) as: The indefinite integral F of f is differentiable to f 
almost everywhere on [a, b]. 


Substitution Theorem 


In view of the simplicity of the Fundamental Theorem 10.1.9, we can improve the theorem
justifying the "substitution formula." The next result is a considerable strengthening of
Theorem 7.3.8. The reader should write out the hypotheses in the case $E_f = E_\varphi = E = \emptyset$.


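10.1.12 Substitution Theorem (a) Let $I := [a, b]$ and $J := [\alpha, \beta]$, and let $F : I \to \mathbb{R}$
and $\varphi : J \to \mathbb{R}$ be continuous functions with $\varphi(J) \subseteq I$.

(b) Suppose there exist sets $E_f \subset I$ and $E_\varphi \subset J$ such that $f(x) = F'(x)$ for $x \in I \setminus E_f$,
that $\varphi'(t)$ exists for $t \in J \setminus E_\varphi$, and that $E := \varphi^{-1}(E_f) \cup E_\varphi$ is countable.

(c) Set $f(x) := 0$ for $x \in E_f$ and $\varphi'(t) := 0$ for $t \in E_\varphi$.

We conclude that $f \in \mathcal{R}^*(\varphi(J))$, that $(f \circ \varphi) \cdot \varphi' \in \mathcal{R}^*(J)$ and that

\[ \int_\alpha^\beta (f \circ \varphi) \cdot \varphi' = F \circ \varphi \,\Big|_\alpha^\beta = \int_{\varphi(\alpha)}^{\varphi(\beta)} f. \tag{9} \]

Proof. Since $\varphi$ is continuous on $J$, Theorem 5.3.9 implies that $\varphi(J)$ is a closed interval in
$I$. Also $\varphi^{-1}(E_f)$ is countable, whence $E_f \cap \varphi(J) = \varphi(\varphi^{-1}(E_f))$ is also countable. Since
$f(x) = F'(x)$ for all $x \in \varphi(J) \setminus E_f$, the Fundamental Theorem 10.1.9 implies that
$f \in \mathcal{R}^*(\varphi(J))$ and that

\[ \int_{\varphi(\alpha)}^{\varphi(\beta)} f = F \,\Big|_{\varphi(\alpha)}^{\varphi(\beta)} = F(\varphi(\beta)) - F(\varphi(\alpha)). \]

If $t \in J \setminus E$, then $t \in J \setminus E_\varphi$ and $\varphi(t) \in I \setminus E_f$. Hence the Chain Rule 6.1.6 implies that

\[ (F \circ \varphi)'(t) = f(\varphi(t)) \cdot \varphi'(t) \quad \text{for } t \in J \setminus E. \]

Since $E$ is countable, the Fundamental Theorem implies that $(f \circ \varphi) \cdot \varphi' \in \mathcal{R}^*(J)$ and that

\[ \int_\alpha^\beta (f \circ \varphi) \cdot \varphi' = F \circ \varphi \,\Big|_\alpha^\beta = F(\varphi(\beta)) - F(\varphi(\alpha)). \]

The conclusion follows by equating these two terms. Q.E.D.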


10.1.13 Examples (a) Consider the integral

\[ \int_0^4 \frac{\cos \sqrt{t}}{\sqrt{t}}\,dt. \]

Since the integrand is unbounded as $t \to 0+$, there is some doubt about the existence
of the integral. Also, we have seen in Exercise 7.3.19(b) that Theorem 7.3.8 does not apply
with $\varphi(t) := \sqrt{t}$. However, Theorem 10.1.12 applies.

Indeed, this substitution gives $\varphi'(t) = 1/(2\sqrt{t})$ for $t \in (0, 4]$ and we set $\varphi'(0) := 0$. If
we put $F(x) := 2 \sin x$, then $f(x) = F'(x) = 2 \cos x$ and the integrand has the form

\[ f(\varphi(t)) \cdot \varphi'(t) = (2 \cos \sqrt{t}) \left( \frac{1}{2\sqrt{t}} \right) \quad \text{for } t \ne 0. \]

Thus, the Substitution Theorem 10.1.12 with $E_\varphi := \{0\}$, $E_f := \emptyset$, $E := \{0\}$ implies that

\[ \int_{t=0}^{t=4} \frac{\cos \sqrt{t}}{\sqrt{t}}\,dt = \int_{x=0}^{x=2} 2 \cos x\,dx = 2 \sin 2. \]
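
As a sanity check on Example 10.1.13(a), both sides of the substitution formula can be approximated and compared with $2 \sin 2$. This is a sketch under stated assumptions: equal subintervals with midpoint tags stand in for gauge-fine partitions, and the helper name `midpoint_sum` and the step count are illustrative choices, not part of the text.

```python
import math

def midpoint_sum(f, a, b, n):
    """Riemann sum of f over [a, b]: n equal subintervals, each tagged at its midpoint."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

n = 200_000  # illustrative; the left side converges slowly because of the singularity at t = 0

# Left side: integrand cos(sqrt(t))/sqrt(t) on [0, 4] (midpoint tags avoid t = 0).
lhs = midpoint_sum(lambda t: math.cos(math.sqrt(t)) / math.sqrt(t), 0.0, 4.0, n)

# Right side after x = sqrt(t): integrand 2 cos x on [0, 2].
rhs = midpoint_sum(lambda x: 2.0 * math.cos(x), 0.0, 2.0, n)

exact = 2.0 * math.sin(2.0)
print(lhs, rhs, exact)
```

The transformed integral on the right is bounded and continuous, so its sum settles much faster than the one on the left.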


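(b) Consider the integral

\[ \int_0^1 \frac{dt}{\sqrt{t - t^2}} = \int_0^1 \frac{dt}{\sqrt{t}\,\sqrt{1 - t}}. \]

Note that this integrand is unbounded as $t \to 0+$ and as $t \to 1-$. As in (a), we let
$x = \varphi(t) := \sqrt{t}$ for $t \in [0, 1]$ so that $\varphi'(t) = 1/(2\sqrt{t})$ for $t \in (0, 1]$. Since $\sqrt{1 - t} = \sqrt{1 - x^2}$,
the integrand takes the form

\[ \frac{2}{\sqrt{1 - t}} \cdot \frac{1}{2\sqrt{t}} = \frac{2}{\sqrt{1 - x^2}} \cdot \varphi'(t), \]

which suggests $f(x) = 2/\sqrt{1 - x^2}$ for $x \ne 1$. Therefore, we are led to choose
$F(x) := 2 \operatorname{Arcsin} x$ for $x \in [0, 1]$, since

\[ \frac{2}{\sqrt{1 - x^2}} = F'(x) = (2 \operatorname{Arcsin} x)' \quad \text{for } x \in [0, 1). \]

Consequently, we have $E_\varphi = \{0\}$ and $E_f = \{1\}$, so that $E = \{0, 1\}$, and the Substitution
Theorem yields

\[ \int_{t=0}^{t=1} \frac{dt}{\sqrt{t}\,\sqrt{1 - t}} = \int_{x=0}^{x=1} \frac{2\,dx}{\sqrt{1 - x^2}} = 2 \operatorname{Arcsin} x \,\Big|_0^1 = 2 \operatorname{Arcsin} 1 = \pi. \]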
Other formulations of the Substitution Theorem are given in [MTI]. 


The Multiplication Theorem 


In Theorem 7.3.16 we saw that the product of two Riemann integrable functions is 
Riemann integrable. That result is not true for generalized Riemann integrable functions; 
see Exercises 18 and 20. However, we will state a theorem in this direction that is often 
useful. Its proof will be found in [MTI]. 


10.1.14 Multiplication Theorem If $f \in \mathcal{R}^*[a, b]$ and if $g$ is a monotone function on
$[a, b]$, then the product $f \cdot g$ belongs to $\mathcal{R}^*[a, b]$.


Integration by Parts 


The following version of the formula for integration by parts is useful. 


10.1.15 Integration by Parts Theorem Let $F$ and $G$ be differentiable on $[a, b]$. Then
$F'G$ belongs to $\mathcal{R}^*[a, b]$ if and only if $FG'$ belongs to $\mathcal{R}^*[a, b]$. In this case we have

\[ \int_a^b F'G = FG \,\Big|_a^b - \int_a^b FG'. \tag{10} \]

The proof uses Theorem 6.1.3(c); it will be left to the reader. In applications, we
usually have $F'(x) = f(x)$ and $G'(x) = g(x)$ for all $x \in [a, b]$. It will be noted that we need
to assume that one of the functions $fG = F'G$ and $Fg = FG'$ belongs to $\mathcal{R}^*[a, b]$.

The reader should contrast the next result with Theorem 7.3.18. Note that we do not
need to assume the integrability of $f^{(n+1)}$.

10.1.16 Taylor's Theorem Suppose that $f, f', f'', \ldots, f^{(n)}$ and $f^{(n+1)}$ exist on $[a, b]$.
Then we have

\[ f(b) = f(a) + \frac{f'(a)}{1!}(b - a) + \cdots + \frac{f^{(n)}(a)}{n!}(b - a)^n + R_n, \tag{11} \]

where the remainder is given by

\[ R_n = \frac{1}{n!} \int_a^b f^{(n+1)}(t) \cdot (b - t)^n\,dt. \tag{12} \]

Proof. Since $f^{(n+1)}$ is a derivative, it belongs to $\mathcal{R}^*[a, b]$. Moreover, since $t \mapsto (b - t)^n$ is
monotone on $[a, b]$, the Multiplication Theorem 10.1.14 implies the integral in (12) exists.
Integrating by parts repeatedly, we obtain (11). Q.E.D.
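
Formulas (11) and (12) can be checked numerically in a simple case. The sketch below takes $f = \exp$ on $[0, 1]$ with $n = 3$, so every derivative is again $\exp$; the function names and the midpoint-sum approximation of the integral in (12) are assumptions made for the illustration.

```python
import math

def taylor_polynomial(derivs_at_a, a, b):
    """Sum of f^(k)(a)/k! * (b - a)^k for k = 0..n, given [f(a), f'(a), ..., f^(n)(a)]."""
    return sum(d / math.factorial(k) * (b - a) ** k for k, d in enumerate(derivs_at_a))

def integral_remainder(f_n_plus_1, a, b, n, m=10_000):
    """Midpoint-sum approximation of (1/n!) * integral over [a, b] of f^(n+1)(t) (b - t)^n dt."""
    h = (b - a) / m
    total = 0.0
    for i in range(m):
        t = a + (i + 0.5) * h
        total += f_n_plus_1(t) * (b - t) ** n
    return h * total / math.factorial(n)

a, b, n = 0.0, 1.0, 3
poly = taylor_polynomial([math.exp(a)] * (n + 1), a, b)  # 1 + 1 + 1/2 + 1/6
remainder = integral_remainder(math.exp, a, b, n)        # approximates R_3
gap = abs(math.exp(b) - (poly + remainder))
print(poly, remainder, gap)
```

Here the gap between $f(b)$ and the polynomial plus the integral remainder should be at the level of the quadrature error.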
Exercises for Section 10.1

1. Let $\delta$ be a gauge on $[a, b]$ and let $\dot{\mathcal{P}} = \{([x_{i-1}, x_i], t_i)\}_{i=1}^{n}$ be a $\delta$-fine partition of $[a, b]$.
   (a) Show that $0 < x_i - x_{i-1} \le 2\delta(t_i)$ for $i = 1, \ldots, n$.
   (b) If $\delta^* := \sup\{\delta(t) : t \in [a, b]\} < \infty$, show that $\|\dot{\mathcal{P}}\| \le 2\delta^*$.
   (c) If $\delta_* := \inf\{\delta(t) : t \in [a, b]\}$ satisfies $\delta_* > 0$, and if $\dot{\mathcal{Q}}$ is a tagged partition of $[a, b]$ such
       that we have $\|\dot{\mathcal{Q}}\| \le \delta_*$, show that $\dot{\mathcal{Q}}$ is $\delta$-fine.
   (d) If $\varepsilon = 1$, show that the gauge $\delta_1$ in Example 10.1.4(a) has the property that
       $\inf\{\delta_1(t) : t \in (0, 1]\} = 0$.

2. (a) If $\dot{\mathcal{P}}$ is a tagged partition of $[a, b]$, show that each tag can belong to at most two subintervals
       in $\dot{\mathcal{P}}$.
   (b) Are there tagged partitions in which every tag belongs to exactly two subintervals?

3. Let $\delta$ be a gauge on $[a, b]$ and let $\dot{\mathcal{P}}$ be a $\delta$-fine partition of $[a, b]$.
   (a) Show that there exists a $\delta$-fine partition $\dot{\mathcal{Q}}_1$ such that (i) no tag belongs to two subintervals
       in $\dot{\mathcal{Q}}_1$, and (ii) $S(f; \dot{\mathcal{Q}}_1) = S(f; \dot{\mathcal{P}})$ for any function $f$ on $[a, b]$.
   (b) Does there exist a $\delta$-fine partition $\dot{\mathcal{Q}}_2$ such that (j) every tag belongs to two subintervals in
       $\dot{\mathcal{Q}}_2$, and (jj) $S(f; \dot{\mathcal{Q}}_2) = S(f; \dot{\mathcal{P}})$ for any function $f$ on $[a, b]$?
   (c) Show that there exists a $\delta$-fine partition $\dot{\mathcal{Q}}_3$ such that (k) every tag is an endpoint of its
       subinterval, and (kk) $S(f; \dot{\mathcal{Q}}_3) = S(f; \dot{\mathcal{P}})$ for any function $f$ on $[a, b]$.

4. If $\delta$ is defined on $[0, 2]$ by $\delta(t) := \tfrac{1}{2}|t - 1|$ for $t \ne 1$ and $\delta(1) := 0.01$, show that every $\delta$-fine
   partition $\dot{\mathcal{P}}$ of $[0, 2]$ has $t = 1$ as a tag for at least one subinterval, and that the total length of the
   subintervals in $\dot{\mathcal{P}}$ having 1 as a tag is $\le 0.02$.

5. (a) Construct a gauge $\delta$ on $[0, 4]$ that will force the numbers 1, 2, 3 to be tags of any $\delta$-fine
       partition of this interval.
   (b) Given a gauge $\delta_1$ on $[0, 4]$, construct a gauge $\delta_2$ such that every $\delta_2$-fine partition of $[0, 4]$
       will (i) have the numbers 1, 2, 3 in its collection of tags, and (ii) be $\delta_1$-fine.

6. Show that $f \in \mathcal{R}^*[a, b]$ with integral $L$ if and only if for every $\varepsilon > 0$ there exists a gauge $\gamma_\varepsilon$ on
   $[a, b]$ such that if $\dot{\mathcal{P}} = \{([x_{i-1}, x_i], t_i)\}_{i=1}^{n}$ is any tagged partition such that
   $0 < x_i - x_{i-1} \le \gamma_\varepsilon(t_i)$ for $i = 1, \ldots, n$, then $|S(f; \dot{\mathcal{P}}) - L| < \varepsilon$. (This provides an alternate,
   but equivalent, way of defining the generalized Riemann integral.)

7. Show that the following functions belong to $\mathcal{R}^*[0, 1]$ by finding a function $F_k$ that is continuous
   on $[0, 1]$ and such that $F'_k(x) = f_k(x)$ for $x \in [0, 1] \setminus E_k$, for some finite set $E_k$.
   (a) $f_1(x) := (x + 1)/\sqrt{x}$ for $x \in (0, 1]$ and $f_1(0) := 0$.
   (b) $f_2(x) := x/\sqrt{1 - x}$ for $x \in [0, 1)$ and $f_2(1) := 0$.
   (c) $f_3(x) := \sqrt{x} \ln x$ for $x \in (0, 1]$ and $f_3(0) := 0$.
   (d) $f_4(x) := (\ln x)/\sqrt{x}$ for $x \in (0, 1]$ and $f_4(0) := 0$.
   (e) $f_5(x) := \sqrt{(1 + x)/(1 - x)}$ for $x \in [0, 1)$ and $f_5(1) := 0$.
   (f) $f_6(x) := 1/(\sqrt{x}\,\sqrt{2 - x})$ for $x \in (0, 1]$ and $f_6(0) := 0$.

8. Explain why the argument in Theorem 7.1.6 does not apply to show that a function in $\mathcal{R}^*[a, b]$ is
   bounded.

9. Let $f(x) := 1/x$ for $x \in (0, 1]$ and $f(0) := 0$; then $f$ is continuous except at $x = 0$. Show that $f$
   does not belong to $\mathcal{R}^*[0, 1]$. [Hint: Compare $f$ with $s_n(x) := 1$ on $(1/2, 1]$, $s_n(x) := 2$ on
   $(1/3, 1/2]$, $s_n(x) := 3$ on $(1/4, 1/3]$, ..., $s_n(x) := n$ on $[0, 1/n]$.]

10. Let $k : [0, 1] \to \mathbb{R}$ be defined by $k(x) := 0$ if $x \in [0, 1]$ is 0 or is irrational, and $k(m/n) := n$ if
    $m, n \in \mathbb{N}$ have no common integer factors other than 1. Show that $k \in \mathcal{R}^*[0, 1]$ with integral
    equal to 0. Also show that $k$ is not continuous at any point, and not bounded on any subinterval
    $[c, d]$ with $c < d$.

11. Let $f$ be Dirichlet's function on $[0, 1]$ and $F(x) := 0$ for all $x \in [0, 1]$. Since $F'(x) = f(x)$ for
    all $x \in [0, 1] \setminus \mathbb{Q}$, show that the Fundamental Theorem 10.1.9 implies that $f \in \mathcal{R}^*[0, 1]$.


12. Let $M(x) := \ln|x|$ for $x \ne 0$ and $M(0) := 0$. Show that $M'(x) = 1/x$ for all $x \ne 0$. Explain why
    it does not follow that $\int_{-2}^{2} (1/x)\,dx = \ln|-2| - \ln 2 = 0$.

13. Let $L_1(x) := x \ln|x| - x$ for $x \ne 0$ and $L_1(0) := 0$, and let $l_1(x) := \ln|x|$ if $x \ne 0$ and
    $l_1(0) := 0$. If $[a, b]$ is any interval, show that $l_1 \in \mathcal{R}^*[a, b]$ and that $\int_a^b \ln|x|\,dx = L_1(b) - L_1(a)$.

14. Let $E := \{c_1, c_2, \ldots\}$ and let $F$ be continuous on $[a, b]$ and $F'(x) = f(x)$ for $x \in [a, b] \setminus E$ and
    $f(c_k) := 0$. We want to show that $f \in \mathcal{R}^*[a, b]$ and that equation (5) holds.
    (a) Given $\varepsilon > 0$ and $t \in [a, b] \setminus E$, let $\delta_\varepsilon(t)$ be defined as in the proof of 10.1.9. Choose
        $\delta_\varepsilon(c_k) > 0$ such that if $|z - c_k| < \delta_\varepsilon(c_k)$ and $z \in [a, b]$, then $|F(z) - F(c_k)| < \varepsilon/2^{k+2}$.
    (b) Show that if the partition $\dot{\mathcal{P}}$ is $\delta_\varepsilon$-fine and has a tag $t_i = c_k$, then we have
        $|F(x_i) - F(x_{i-1}) - f(c_k)(x_i - x_{i-1})| < \varepsilon/2^{k+1}$.
    (c) Use the argument in 10.1.9 to get $|S(f; \dot{\mathcal{P}}) - (F(b) - F(a))| < \varepsilon(b - a + 1)$.

15. Show that the function $g_1(x) := x^{-1/2} \sin(1/x)$ for $x \in (0, 1]$ and $g_1(0) := 0$ belongs to
    $\mathcal{R}^*[0, 1]$. [Hint: Differentiate $C_1(x) := x^{3/2} \cos(1/x)$ for $x \in (0, 1]$ and $C_1(0) := 0$.]

16. Show that the function $g_2(x) := (1/x) \sin(1/x)$ for $x \in (0, 1]$ and $g_2(0) := 0$ belongs to
    $\mathcal{R}^*[0, 1]$. [Hint: Differentiate $C_2(x) := x \cos(1/x)$ for $x \in (0, 1]$ and $C_2(0) := 0$, and use
    the result for the cosine function that corresponds to Exercise 7.2.12.]

17. Use the Substitution Theorem 10.1.12 to evaluate the following integrals:
    (a) $\int_{-3}^{3} (2t + 1)\,\mathrm{sgn}(t^2 + t - 2)\,dt = 6$,
    (b) $\int_0^4 \frac{\sqrt{t}\,dt}{1 + \sqrt{t}}$,
    (c) $\int_1^5 \frac{dt}{t\sqrt{t - 1}} = 2 \operatorname{Arctan} 2$,
    (d) $\int_0^1 \sqrt{1 - t^2}\,dt$.

18. Give an example of a function $f \in \mathcal{R}^*[0, 1]$ whose square $f^2$ does not belong to $\mathcal{R}^*[0, 1]$.

19. Let $F(x) := x \cos(\pi/x)$ for $x \in (0, 1]$ and $F(0) := 0$. It will be seen that $f := F' \in \mathcal{R}^*[0, 1]$ but
    that its absolute value $|f| = |F'| \notin \mathcal{R}^*[0, 1]$. (Here $f(0) := 0$.)
    (a) Show that $F'$ and $|F'|$ are continuous on any interval $[c, 1]$, $0 < c < 1$, and $f \in \mathcal{R}^*[0, 1]$.
    (b) If $a_k := 2/(2k + 1)$ and $b_k := 1/k$ for $k \in \mathbb{N}$, then the intervals $[a_k, b_k]$ are
        non-overlapping and $1/k \le \int_{a_k}^{b_k} |f|$.
    (c) Since the series $\sum_{k=1}^{\infty} 1/k$ diverges, then $|f| \notin \mathcal{R}^*[0, 1]$.

20. Let $f$ be as in Exercise 19 and let $m(x) := (-1)^k$ for $x \in [a_k, b_k]$ ($k \in \mathbb{N}$), and $m(x) := 0$
    elsewhere in $[0, 1]$. Show that $m \cdot f = |m \cdot f|$. Use Exercise 7.2.11 to show that the bounded
    functions $m$ and $|m|$ belong to $\mathcal{R}[0, 1]$. Conclude that the product of a function in $\mathcal{R}^*[0, 1]$ and a
    bounded function in $\mathcal{R}[0, 1]$ may not belong to $\mathcal{R}^*[0, 1]$.

21. Let $\Phi(x) := x|\cos(\pi/x)|$ for $x \in (0, 1]$ and let $\Phi(0) := 0$. Then $\Phi$ is continuous on $[0, 1]$ and
    $\Phi'(x)$ exists for $x \notin E := \{0\} \cup \{a_k : k \in \mathbb{N}\}$, where $a_k := 2/(2k + 1)$. Let $\varphi(x) := \Phi'(x)$ for
    $x \notin E$ and $\varphi(x) := 0$ for $x \in E$. Show that $\varphi$ is not bounded on $[0, 1]$. Using the Fundamental
    Theorem 10.1.9 with $E$ countable, conclude that $\varphi \in \mathcal{R}^*[0, 1]$ and that $\int_a^b \varphi = \Phi(b) - \Phi(a)$ for
    $a, b \in [0, 1]$. As in Exercise 19, show that $|\varphi| \notin \mathcal{R}^*[0, 1]$.

22. Let $\Psi(x) := x^2|\cos(\pi/x)|$ for $x \in (0, 1]$ and $\Psi(0) := 0$. Then $\Psi$ is continuous on $[0, 1]$ and
    $\Psi'(x)$ exists for $x \notin E_1 := \{a_k\}$. Let $\psi(x) := \Psi'(x)$ for $x \notin E_1$, and $\psi(x) := 0$ for $x \in E_1$.
    Show that $\psi$ is bounded on $[0, 1]$ and (using Exercise 7.2.11) that $\psi \in \mathcal{R}[0, 1]$. Show that
    $\int_a^b \psi = \Psi(b) - \Psi(a)$ for $a, b \in [0, 1]$. Also show that $|\psi| \in \mathcal{R}[0, 1]$.

23. If $f : [a, b] \to \mathbb{R}$ is continuous and if $p \in \mathcal{R}^*[a, b]$ does not change sign on $[a, b]$, and if
    $fp \in \mathcal{R}^*[a, b]$, then there exists $\xi \in [a, b]$ such that $\int_a^b fp = f(\xi) \int_a^b p$. (This is a generalization of
    Exercise 7.2.16; it is called the First Mean Value Theorem for integrals.)

24. Let $f \in \mathcal{R}^*[a, b]$, let $g$ be monotone on $[a, b]$ and suppose that $f \ge 0$. Then there exists $\xi \in [a, b]$
    such that $\int_a^b fg = g(a) \int_a^\xi f + g(b) \int_\xi^b f$. (This is a form of the Second Mean Value Theorem for
    integrals.)


Section 10.2 Improper and Lebesgue Integrals


We have seen in Theorem 7.1.6 that a function $f$ in $\mathcal{R}[a, b]$ must be bounded on $[a, b]$
(although this need not be the case for a function in $\mathcal{R}^*[a, b]$). In order to integrate certain
functions that have infinite limits at a point $c$ in $[a, b]$, or that are highly oscillatory at such a
point, one learns in calculus to take limits of integrals over subintervals, as the endpoints of
these subintervals tend to the point $c$.

For example, the function $h(x) := 1/\sqrt{x}$ for $x \in (0, 1]$ and $h(0) := 0$ is unbounded on
a neighborhood of the left endpoint of $[0, 1]$. However, it does belong to $\mathcal{R}[\gamma, 1]$ for every
$\gamma \in (0, 1]$ and we define the "improper Riemann integral" of $h$ on $[0, 1]$ to be the limit

\[ \int_0^1 \frac{1}{\sqrt{x}}\,dx := \lim_{\gamma \to 0+} \int_\gamma^1 \frac{1}{\sqrt{x}}\,dx. \]

We would treat the oscillatory function $k(x) := \sin(1/x)$ for $x \in (0, 1]$ and $k(0) := 0$ in
the same way.

One handles a function that becomes unbounded, or is highly oscillatory, at the right
endpoint of the interval in a similar fashion. Furthermore, if a function $g$ is unbounded, or is
highly oscillatory, near some $c \in (a, b)$, then we define the "improper Riemann integral"
to be

\[ \int_a^b g := \lim_{\alpha \to c-} \int_a^\alpha g + \lim_{\beta \to c+} \int_\beta^b g. \]

These limiting processes are not necessary when one deals with the generalized
Riemann integral.

For example, we have seen in Example 10.1.10(a) that if $H(x) := 2\sqrt{x}$ for $x \in [0, 1]$
then $H'(x) = 1/\sqrt{x} =: h(x)$ for $x \in (0, 1]$ and the Fundamental Theorem 10.1.9 asserts
that $h \in \mathcal{R}^*[0, 1]$ and that

\[ \int_0^1 \frac{1}{\sqrt{x}}\,dx = H(1) - H(0) = 2. \]

This example is an instance of a remarkable theorem due to Heinrich Hake, which we now 
state in the case where the function becomes unbounded or is oscillatory near the right 
endpoint of the interval. 


10.2.1 Hake's Theorem If $f : [a, b] \to \mathbb{R}$, then $f \in \mathcal{R}^*[a, b]$ if and only if for every
$\gamma \in (a, b)$ the restriction of $f$ to $[a, \gamma]$ belongs to $\mathcal{R}^*[a, \gamma]$ and

\[ \lim_{\gamma \to b-} \int_a^\gamma f = A \in \mathbb{R}. \tag{2} \]

In this case $\int_a^b f = A$.


The idea of the proof of the ($\Leftarrow$) part of this result is to take an increasing sequence
$(\gamma_n)$ converging to $b$ so that $f \in \mathcal{R}^*[a, \gamma_n]$ and $\lim_n \int_a^{\gamma_n} f = A$. In order to show that
$f \in \mathcal{R}^*[a, b]$, we need to construct gauges on $[a, b]$. This is done by carefully "piecing
together" gauges that work for the intervals $[\gamma_{i-1}, \gamma_i]$ to obtain a gauge on $[a, b]$. Since the
details of this construction are somewhat delicate and not particularly informative, we will
not go through them here but refer the reader to [MTI].

It is important to understand the significance of Hake's Theorem.


It implies that the generalized Riemann integral cannot be extended by taking limits as
in (2). Indeed, if a function $f$ has the property that its restriction to every subinterval
$[a, \gamma]$, where $\gamma \in (a, b)$, is generalized Riemann integrable and such that (2) holds, then
$f$ already belongs to $\mathcal{R}^*[a, b]$.

An alternative way of expressing this fact is that the generalized Riemann integral does
not need to be extended by taking such limits.


One can test a function for integrability on $[a, b]$ by examining its behavior on
subintervals $[a, \gamma]$ with $\gamma < b$. Since it is usually difficult to establish that a function
is in $\mathcal{R}^*[a, b]$ by using Definition 10.1.1, this fact gives us another tool for showing that a
function is generalized Riemann integrable on $[a, b]$.


It is often useful to evaluate the integral of a function by using (2).


We will use these observations to give an important example that provides insight into 
the set of generalized Riemann integrable functions. 


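10.2.2 Example (a) Let $\sum_{k=1}^{\infty} a_k$ be any series of real numbers converging to $A \in \mathbb{R}$. We
will construct a function $\varphi \in \mathcal{R}^*[0, 1]$ such that

\[ \int_0^1 \varphi = \sum_{k=1}^{\infty} a_k = A. \]

Indeed, we define $\varphi : [0, 1] \to \mathbb{R}$ to be the function that takes the values
$2a_1, 2^2 a_2, 2^3 a_3, \ldots$ on the intervals $[0, \tfrac{1}{2}), [\tfrac{1}{2}, \tfrac{3}{4}), [\tfrac{3}{4}, \tfrac{7}{8}), \ldots$. (See Figure 10.2.1.) For
convenience, let $c_k := 1 - 1/2^k$ for $k = 0, 1, \ldots$; then

\[ \varphi(x) := \begin{cases} 2^k a_k & \text{for } c_{k-1} \le x < c_k \ (k \in \mathbb{N}), \\ 0 & \text{for } x = 1. \end{cases} \]

Figure 10.2.1 The graph of $\varphi$

Clearly the restriction of $\varphi$ to each interval $[0, \gamma]$ for $\gamma \in (0, 1)$ is a step function and
therefore is integrable. In fact, if $\gamma \in [c_n, c_{n+1})$ then

\[ \int_0^\gamma \varphi = (2a_1) \cdot \Big(\frac{1}{2}\Big) + (2^2 a_2) \cdot \Big(\frac{1}{2^2}\Big) + \cdots + (2^n a_n) \cdot \Big(\frac{1}{2^n}\Big) + r_\gamma = a_1 + a_2 + \cdots + a_n + r_\gamma, \]

where $|r_\gamma| \le |a_{n+1}|$. But since the series is convergent, then $r_\gamma \to 0$ and so

\[ \lim_{\gamma \to 1-} \int_0^\gamma \varphi = \lim_{n \to \infty} \sum_{k=1}^{n} a_k = A. \]

Hake's Theorem 10.2.1 then implies that $\varphi \in \mathcal{R}^*[0, 1]$ with integral $A$.

(b) If the series $\sum_{k=1}^{\infty} a_k$ is absolutely convergent in the sense of Definition 9.1.1, then it
follows as in (a) that the function $|\varphi|$ also belongs to $\mathcal{R}^*[0, 1]$ and that

\[ \int_0^1 |\varphi| = \sum_{k=1}^{\infty} |a_k|. \]

However, if the series $\sum_{k=1}^{\infty} |a_k|$ is not convergent, then the function $|\varphi|$ does not belong to
$\mathcal{R}^*[0, 1]$.

Since there are many convergent series that are not absolutely convergent (for
example, $\sum_{k=1}^{\infty} (-1)^k/k$), we have examples of functions that belong to $\mathcal{R}^*[0, 1]$ but whose
absolute values do not belong to $\mathcal{R}^*[0, 1]$. We have already encountered such functions in
Exercises 10.1.19 and 10.1.21.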
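
The construction in Example 10.2.2 can be traced numerically for the series $\sum (-1)^k/k$ mentioned above. The following is a sketch only: the function names are hypothetical, and the integral of the step function over $[0, c_n]$ is computed piece by piece (height times length) rather than by any gauge-based procedure.

```python
import math

def a(k):
    """Terms a_k = (-1)^k / k of a convergent, not absolutely convergent, series (sum = -ln 2)."""
    return (-1) ** k / k

def integral_phi(n, absolute=False):
    """Integral of phi (or |phi|) over [0, c_n], where c_n = 1 - 1/2**n.

    phi equals 2**k * a_k on [c_{k-1}, c_k), an interval of length 1/2**k,
    so the k-th piece contributes a_k (or |a_k|).
    Keep n below about 1000 so that 2**k still fits in a float.
    """
    total = 0.0
    for k in range(1, n + 1):
        height = 2 ** k * a(k)
        length = 1 / 2 ** k
        total += (abs(height) if absolute else height) * length
    return total

for n in (10, 100, 1000):
    print(n, integral_phi(n), integral_phi(n, absolute=True))
```

The first column of values approaches $-\ln 2$, while the integrals of $|\varphi|$ are partial sums of the harmonic series and keep growing as $c_n \to 1$.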


The fact that there are generalized Riemann integrable functions whose absolute value is 
not generalized Riemann integrable is often summarized by saying that the generalized 
Riemann integral is not an ‘“‘absolute integral.” Thus, in passing to the generalized Riemann 
integral we lose an important property of the (ordinary) Riemann integral. But that is the price 
that one must pay in order to be able to integrate a much larger class of functions. 


Lebesgue Integrable Functions 


In view of the importance of the subset of functions in $\mathcal{R}^*[a, b]$ whose absolute values also
belong to $\mathcal{R}^*[a, b]$, we will introduce the following definition.


10.2.3 Definition A function $f \in \mathcal{R}^*[a, b]$ such that $|f| \in \mathcal{R}^*[a, b]$ is said to be
Lebesgue integrable on $[a, b]$. The collection of all Lebesgue integrable functions on $[a, b]$ is
denoted by $\mathcal{L}[a, b]$.


Note The collection of all Lebesgue integrable functions is usually introduced in a totally
different manner. One of the advantages of the generalized Riemann integral is that it
includes the collection of Lebesgue integrable functions as a special, and easily
identifiable, collection of functions.


It is clear that if $f \in \mathcal{R}^*[a, b]$ and if $f(x) \ge 0$ for all $x \in [a, b]$, then we have
$|f| = f \in \mathcal{R}^*[a, b]$, so that $f \in \mathcal{L}[a, b]$. That is, a nonnegative function $f \in \mathcal{R}^*[a, b]$
belongs to $\mathcal{L}[a, b]$. The next result gives a more powerful test for a function in $\mathcal{R}^*[a, b]$
to belong to $\mathcal{L}[a, b]$.


10.2.4 Comparison Test If $f, \omega \in \mathcal{R}^*[a, b]$ and $|f(x)| \le \omega(x)$ for all $x \in [a, b]$, then
$f \in \mathcal{L}[a, b]$ and

\[ \Big| \int_a^b f \Big| \le \int_a^b |f| \le \int_a^b \omega. \tag{3} \]

Partial Proof. The fact that $|f| \in \mathcal{R}^*[a, b]$ is proved in [MTI]. Since $|f| \ge 0$, this
implies that $f \in \mathcal{L}[a, b]$.

To establish (3), we note that $-|f| \le f \le |f|$ and 10.1.5(c) imply that

\[ -\int_a^b |f| \le \int_a^b f \le \int_a^b |f|, \]

whence the first inequality in (3) follows. The second inequality follows from another
application of 10.1.5(c). Q.E.D.


The next result shows that constant multiples and sums of functions in $\mathcal{L}[a, b]$ also
belong to $\mathcal{L}[a, b]$.


10.2.5 Theorem If $f, g \in \mathcal{L}[a, b]$ and if $c \in \mathbb{R}$, then $cf$ and $f + g$ also belong to $\mathcal{L}[a, b]$.
Moreover

\[ \int_a^b cf = c \int_a^b f \quad \text{and} \quad \int_a^b |f + g| \le \int_a^b |f| + \int_a^b |g|. \tag{4} \]

Proof. Since $|cf(x)| = |c||f(x)|$ for all $x \in [a, b]$, the hypothesis that $|f|$ belongs to
$\mathcal{R}^*[a, b]$ implies that $cf$ and $|cf|$ also belong to $\mathcal{R}^*[a, b]$, whence $cf \in \mathcal{L}[a, b]$.

The Triangle Inequality implies that $|f(x) + g(x)| \le |f(x)| + |g(x)|$ for all $x \in [a, b]$.
But since $\omega := |f| + |g|$ belongs to $\mathcal{R}^*[a, b]$, the Comparison Test 10.2.4 implies that $f + g$
belongs to $\mathcal{L}[a, b]$ and that

\[ \int_a^b |f + g| \le \int_a^b (|f| + |g|) = \int_a^b |f| + \int_a^b |g|. \]

Q.E.D.


The next result asserts that one only needs to establish a one-sided inequality in order
to show that a function $f \in \mathcal{R}^*[a, b]$ actually belongs to $\mathcal{L}[a, b]$.

10.2.6 Theorem If $f \in \mathcal{R}^*[a, b]$, the following assertions are equivalent:

(a) $f \in \mathcal{L}[a, b]$.
(b) There exists $\omega \in \mathcal{L}[a, b]$ such that $f(x) \le \omega(x)$ for all $x \in [a, b]$.
(c) There exists $\alpha \in \mathcal{L}[a, b]$ such that $\alpha(x) \le f(x)$ for all $x \in [a, b]$.


Proof. (a) $\Rightarrow$ (b) Let $\omega := f$.

(b) $\Rightarrow$ (a) Note that $f = \omega - (\omega - f)$. Since $\omega - f \ge 0$ and since $\omega - f$ belongs to
$\mathcal{R}^*[a, b]$, it follows that $\omega - f \in \mathcal{L}[a, b]$. Now apply Theorem 10.2.5.

We leave the proof that (a) $\Leftrightarrow$ (c) to the reader. Q.E.D.


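10.2.7 Theorem If $f, g \in \mathcal{L}[a, b]$, then the functions $\max\{f, g\}$ and $\min\{f, g\}$ also
belong to $\mathcal{L}[a, b]$.


Proof. It follows from Exercise 2.2.18 that if $x \in [a, b]$, then

\[ \max\{f(x), g(x)\} = \tfrac{1}{2}\big(f(x) + g(x) + |f(x) - g(x)|\big), \]
\[ \min\{f(x), g(x)\} = \tfrac{1}{2}\big(f(x) + g(x) - |f(x) - g(x)|\big). \]

The assertions follow from these equations and Theorem 10.2.5. Q.E.D.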


In fact, the preceding result gives a useful conclusion about the maximum and the
minimum of two functions in $\mathcal{R}^*[a, b]$.

10.2.8 Theorem Suppose that $f, g, \alpha$, and $\omega$ belong to $\mathcal{R}^*[a, b]$. If

\[ f \le \omega, \ g \le \omega \quad \text{or if} \quad \alpha \le f, \ \alpha \le g, \]

then $\max\{f, g\}$ and $\min\{f, g\}$ also belong to $\mathcal{R}^*[a, b]$.


Proof. Suppose that $f \le \omega$ and $g \le \omega$; then $\max\{f, g\} \le \omega$. It follows from the first
equality in the proof of Theorem 10.2.7 that

\[ 0 \le |f - g| = 2\max\{f, g\} - f - g \le 2\omega - f - g. \]

Since $2\omega - f - g \ge 0$, this function belongs to $\mathcal{L}[a, b]$. The Comparison Test 10.2.4
implies that $2\max\{f, g\} - f - g$ belongs to $\mathcal{L}[a, b]$, and so $\max\{f, g\}$ belongs to $\mathcal{R}^*[a, b]$.
The second part of the assertion is proved similarly. Q.E.D.


The Seminorm in $\mathcal{L}[a, b]$


We will now define the "seminorm" of a function in $\mathcal{L}[a, b]$ and the "distance between"
two such functions.


10.2.9 Definition If $f \in \mathcal{L}[a, b]$, we define the seminorm of $f$ to be

\[ \|f\| := \int_a^b |f|. \]

If $f, g \in \mathcal{L}[a, b]$, we define the distance between $f$ and $g$ to be

\[ \mathrm{dist}(f, g) := \|f - g\| = \int_a^b |f - g|. \]

We now establish a few properties of the seminorm and distance functions.


10.2.10 Theorem The seminorm function satisfies:

(i) $\|f\| \ge 0$ for all $f \in \mathcal{L}[a, b]$.
(ii) If $f(x) = 0$ for $x \in [a, b]$, then $\|f\| = 0$.
(iii) If $f \in \mathcal{L}[a, b]$ and $c \in \mathbb{R}$, then $\|cf\| = |c| \cdot \|f\|$.
(iv) If $f, g \in \mathcal{L}[a, b]$, then $\|f + g\| \le \|f\| + \|g\|$.


Proof. Parts (i)-(iii) are easily seen. Part (iv) follows from the fact that $|f + g| \le |f| + |g|$
and Theorem 10.1.5(c). Q.E.D.


10.2.11 Theorem The distance function satisfies:

(i) $\mathrm{dist}(f, g) \ge 0$ for all $f, g \in \mathcal{L}[a, b]$.
(ii) If $f(x) = g(x)$ for $x \in [a, b]$, then $\mathrm{dist}(f, g) = 0$.
(iii) $\mathrm{dist}(f, g) = \mathrm{dist}(g, f)$ for all $f, g \in \mathcal{L}[a, b]$.
(iv) $\mathrm{dist}(f, h) \le \mathrm{dist}(f, g) + \mathrm{dist}(g, h)$ for all $f, g, h \in \mathcal{L}[a, b]$.


These assertions follow from the corresponding ones in Theorem 10.2.10. Their proofs
will be left as exercises.
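
The seminorm and distance of Definition 10.2.9 are easy to approximate for continuous functions, which makes the properties in Theorems 10.2.10 and 10.2.11 concrete. The sketch below is an illustration only: the names `seminorm` and `dist`, the midpoint-sum approximation, and the three sample functions on $[0, 1]$ are assumptions, not part of the text.

```python
import math

def seminorm(f, a, b, n=20_000):
    """Midpoint-sum approximation of ||f|| = integral over [a, b] of |f|."""
    h = (b - a) / n
    return h * sum(abs(f(a + (i + 0.5) * h)) for i in range(n))

def dist(f, g, a, b, n=20_000):
    """Approximation of dist(f, g) = ||f - g||."""
    return seminorm(lambda x: f(x) - g(x), a, b, n)

f = lambda x: x
g = lambda x: x * x
s = lambda x: math.sin(x)

norm_f = seminorm(f, 0.0, 1.0)                            # exact value 1/2
d_fg = dist(f, g, 0.0, 1.0)                               # exact value 1/6
scaled = seminorm(lambda x: -3.0 * f(x), 0.0, 1.0)        # property (iii): |c| * ||f||
lhs = dist(f, s, 0.0, 1.0)                                # property (iv) for dist
rhs = dist(f, g, 0.0, 1.0) + dist(g, s, 0.0, 1.0)
print(norm_f, d_fg, scaled, lhs, rhs)
```

Since the midpoint sums of $|f - g|$ are themselves finite sums of absolute values, the triangle inequality holds for the approximations as well as for the exact integrals.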

Using the seminorm (or the distance function) we can define what we mean for a
sequence of functions $(f_n)$ in $\mathcal{L}[a, b]$ to converge to a function $f \in \mathcal{L}[a, b]$; namely, given
any $\varepsilon > 0$ there exists $K(\varepsilon)$ such that if $n \ge K(\varepsilon)$ then

\[ \|f_n - f\| = \mathrm{dist}(f_n, f) < \varepsilon. \]

This notion of convergence can be used exactly as we have used the distance function in $\mathbb{R}$
for the convergence of sequences of real numbers.

We will conclude this section with a statement of the Completeness Theorem for
$\mathcal{L}[a, b]$ (also called the Riesz-Fischer Theorem). It plays the same role in the space $\mathcal{L}[a, b]$
that the Completeness Property plays in $\mathbb{R}$.


10.2.12 Completeness Theorem A sequence $(f_n)$ of functions in $\mathcal{L}[a, b]$ converges to a
function $f \in \mathcal{L}[a, b]$ if and only if it has the property that for every $\varepsilon > 0$ there exists $H(\varepsilon)$
such that if $m, n \ge H(\varepsilon)$ then

\[ \|f_m - f_n\| = \mathrm{dist}(f_m, f_n) < \varepsilon. \]

The direction ($\Rightarrow$) is very easy to prove and is left as an exercise. A proof of the direction ($\Leftarrow$)
is more involved, but can be based on the following idea: Find a subsequence $(g_k) := (f_{n_k})$ of
$(f_n)$ such that $\|g_{k+1} - g_k\| < 1/2^k$ and define $f(x) := g_1(x) + \sum_{k=1}^{\infty} (g_{k+1}(x) - g_k(x))$, where
this series is absolutely convergent, and $f(x) := 0$ elsewhere. It can then be shown that
$f \in \mathcal{L}[a, b]$ and that $\|f_n - f\| \to 0$. (The details are given in [MTI].)


Exercises for Section 10.2 


1. Show that Hake's Theorem 10.2.1 can be given the following sequential formulation: A function
   $f \in \mathcal{R}^*[a, b]$ if and only if there exists $A \in \mathbb{R}$ such that for any increasing sequence $(c_n)$ in $(a, b)$
   with $c_n \to b$, then $f \in \mathcal{R}^*[a, c_n]$ and $\int_a^{c_n} f \to A$.

2. (a) Apply Hake's Theorem to conclude that $g(x) := 1/x^{2/3}$ for $x \in (0, 1]$ and $g(0) := 0$
       belongs to $\mathcal{R}^*[0, 1]$.
   (b) Explain why Hake's Theorem does not apply to $f(x) := 1/x$ for $x \in (0, 1]$ and $f(0) := 0$
       (which does not belong to $\mathcal{R}^*[0, 1]$).

3. Apply Hake's Theorem to $g(x) := (1 - x)^{-1/2}$ for $x \in [0, 1)$ and $g(1) := 0$.

4. Suppose that $f \in \mathcal{R}^*[a, c]$ for all $c \in (a, b)$ and that there exists $\gamma \in (a, b)$ and $\omega \in \mathcal{L}[a, b]$ such
   that $|f(x)| \le \omega(x)$ for $x \in [\gamma, b]$. Show that $f \in \mathcal{R}^*[a, b]$.

5. Show that the function $g_1(x) := x^{-1/2} \sin(1/x)$ for $x \in (0, 1]$ and $g_1(0) := 0$ belongs to $\mathcal{L}[0, 1]$.
   (This function was also considered in Exercise 10.1.15.)


6. Show that the following functions (properly defined when necessary) are in £[0, 1]. 


xInx b sin m xX 
© Tp Cares 
© (Inx)(In(1 = x)), (a) a 


7. Determine whether the following integrals are convergent or divergent. (Define the integrands to 
be 0 where they are not already defined.) 


sin x dx l cos x dx 
(a) i Ds (b) [Sr 
© [3s Inxdx (a) pe 
xvi — x2’ o 1-x’ 


(e) [ena9¢sin(1/x) a ®© f oxy 
8. If $f \in \mathcal{R}[a, b]$, show that $f \in \mathcal{L}[a, b]$.

9. If $f \in \mathcal{L}[a, b]$, show that $f^2$ is not necessarily in $\mathcal{L}[a, b]$.

10. If $f, g \in \mathcal{L}[a, b]$ and if $g$ is bounded and monotone, show that $fg \in \mathcal{L}[a, b]$. More exactly, if
    $|g(x)| \le B$, show that $\|fg\| \le B\|f\|$.

11. (a) Give an example of a function $f \in \mathcal{R}^*[0, 1]$ such that $\max\{f, 0\}$ does not belong to $\mathcal{R}^*[0, 1]$.
    (b) Can you give an example of $f \in \mathcal{L}[0, 1]$ such that $\max\{f, 0\} \notin \mathcal{L}[0, 1]$?

12. Write out the details of the proof that $\min\{f, g\} \in \mathcal{R}^*[a, b]$ in Theorem 10.2.8 when $\alpha \le f$ and
    $\alpha \le g$.

13. Write out the details of the proofs of Theorem 10.2.11.

14. Give an $f \in \mathcal{L}[a, b]$ with $f$ not identically 0, but such that $\|f\| = 0$.

15. If $f, g \in \mathcal{L}[a, b]$, show that $\big|\, \|f\| - \|g\| \,\big| \le \|f + g\|$.

16. Establish the easy part of the Completeness Theorem 10.2.12.

17. If $f_n(x) := x^n$ for $n \in \mathbb{N}$, show that $f_n \in \mathcal{L}[0, 1]$ and that $\|f_n\| \to 0$. Thus $\|f_n - \theta\| \to 0$, where $\theta$
    denotes the function identically equal to 0.

18. Let $g_n(x) := -1$ for $x \in [-1, -1/n)$, let $g_n(x) := nx$ for $x \in [-1/n, 1/n]$ and let $g_n(x) := 1$ for
    $x \in (1/n, 1]$. Show that $\|g_m - g_n\| \to 0$ as $m, n \to \infty$, so that the Completeness Theorem 10.2.12 implies
    that there exists $g \in \mathcal{L}[-1, 1]$ such that $(g_n)$ converges to $g$ in $\mathcal{L}[-1, 1]$. Find such a function $g$.

19. Let $h_n(x) := n$ for $x \in (0, 1/n)$ and $h_n(x) := 0$ elsewhere in $[0, 1]$. Does there exist $h \in \mathcal{L}[0, 1]$
    such that $\|h_n - h\| \to 0$?

20. Let $k_n(x) := n$ for $x \in (0, 1/n^2)$ and $k_n(x) := 0$ elsewhere in $[0, 1]$. Does there exist $k \in \mathcal{L}[0, 1]$
    such that $\|k_n - k\| \to 0$?

Section 10.3 Infinite Intervals 


In the preceding two sections, we have discussed the integration of functions defined on
bounded closed intervals $[a, b]$. However, in applications we often want to integrate
functions defined on unbounded closed intervals, such as

\[ [a, \infty), \quad (-\infty, b], \quad \text{or} \quad (-\infty, \infty). \]

In calculus, the standard approach is to define an integral over $[a, \infty)$ as a limit:

\[ \int_a^\infty f := \lim_{\gamma \to \infty} \int_a^\gamma f, \]

and to define integrals over the other infinite intervals similarly. In this section, we will treat
the generalized Riemann integrable (and Lebesgue integrable) functions defined on infinite
intervals.

In defining the generalized Riemann integral of a function $f$ on $[a, \infty)$, we will
adopt a somewhat different procedure from that in calculus. We note that if
$\dot{\mathcal{Q}} := \{([x_0, x_1], t_1), \ldots, ([x_{n-1}, x_n], t_n), ([x_n, \infty], t_{n+1})\}$ is a tagged partition of $[a, \infty]$, then
$x_0 = a$ and $x_{n+1} = \infty$ and the Riemann sum corresponding to $\dot{\mathcal{Q}}$ has the form:

\[ f(t_1)(x_1 - x_0) + \cdots + f(t_n)(x_n - x_{n-1}) + f(t_{n+1})(\infty - x_n). \tag{1} \]

Since the final term $f(t_{n+1})(\infty - x_n)$ in (1) is not meaningful, we wish to suppress this
term. We can do this in two different ways: (i) define the Riemann sum to contain only the
first $n$ terms, or (ii) have a procedure that will enable us to deal with the symbols $\pm\infty$ in
calculations in such a way that we eliminate the final term in (1).

We choose to adopt method (i): instead of dealing with partitions of $[a, \infty)$ into a finite
number of non-overlapping intervals (one of which must necessarily have infinite length), we
deal with certain subpartitions of $[a, \infty)$, which are finite collections of non-overlapping
intervals of finite length whose union is properly contained in $[a, \infty)$.

We define a gauge on $[a, \infty]$ to be an ordered pair consisting of a strictly positive
function $\delta$ defined on $[a, \infty)$ and a number $d^* > 0$. When we say that a tagged subpartition
$\dot{\mathcal{P}} := \{([x_0, x_1], t_1), \ldots, ([x_{n-1}, x_n], t_n)\}$ is $(\delta, d^*)$-fine, we mean that

\[ [a, \infty) = \bigcup_{i=1}^{n} [x_{i-1}, x_i] \cup [x_n, \infty), \tag{2} \]

that

\[ [x_{i-1}, x_i] \subseteq [t_i - \delta(t_i), t_i + \delta(t_i)] \quad \text{for } i = 1, \ldots, n, \tag{3} \]

and that

\[ [x_n, \infty) \subseteq [1/d^*, \infty) \tag{4} \]

or, equivalently, that

\[ 1/d^* \le x_n. \tag{4'} \]

Note Ordinarily we consider a gauge on $[a, \infty]$ to be a strictly positive function $\delta$ with
domain $[a, \infty] := [a, \infty) \cup \{\infty\}$ where $\delta(\infty) := d^*$.
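
Conditions (2), (3), and (4') translate directly into a finite check. The sketch below is one possible encoding, with assumptions stated plainly: a tagged subpartition is represented as a list of `((x_prev, x_next), tag)` pairs, the gauge is passed as a Python function `delta` together with the number `d_star`, and the function name `is_fine` is hypothetical.

```python
def is_fine(subpartition, a, delta, d_star):
    """Return True if the tagged subpartition of [a, oo) is (delta, d_star)-fine.

    subpartition: list of ((x_prev, x_next), tag) pairs, listed from left to right.
    """
    if not subpartition:
        return False
    left = a
    for (x_prev, x_next), tag in subpartition:
        # (2): the intervals must start at a, abut each other, and be nondegenerate.
        if x_prev != left or not x_prev < x_next:
            return False
        # Each tag lies in its own subinterval.
        if not x_prev <= tag <= x_next:
            return False
        # (3): [x_prev, x_next] is contained in [tag - delta(tag), tag + delta(tag)].
        if not (tag - delta(tag) <= x_prev and x_next <= tag + delta(tag)):
            return False
        left = x_next
    # (4'): the last partition point x_n satisfies 1/d* <= x_n.
    return 1.0 / d_star <= left

# Three unit intervals on [0, 3], each tagged at its midpoint.
example = [((0.0, 1.0), 0.5), ((1.0, 2.0), 1.5), ((2.0, 3.0), 2.5)]
print(is_fine(example, 0.0, lambda t: 0.5, 0.5))   # 1/d* = 2 <= 3
print(is_fine(example, 0.0, lambda t: 0.5, 0.25))  # 1/d* = 4 > 3
print(is_fine(example, 0.0, lambda t: 0.4, 0.5))   # intervals too long for delta
```

Note how a smaller $d^*$ forces the subpartition to reach further out: the role of $d^*$ is to control the tail $[x_n, \infty)$ that the Riemann sum ignores.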


We will now define the generalized Riemann integral over [a, oo). 


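10.3.1 Definition (a) A function $f : [a, \infty) \to \mathbb{R}$ is said to be generalized Riemann
integrable if there exists $A \in \mathbb{R}$ such that for every $\varepsilon > 0$ there exists a gauge $\delta_\varepsilon$ on $[a, \infty]$
such that if $\dot{\mathcal{P}}$ is any $\delta_\varepsilon$-fine tagged subpartition of $[a, \infty)$, then $|S(f; \dot{\mathcal{P}}) - A| \le \varepsilon$. In this
case we write $f \in \mathcal{R}^*[a, \infty)$ and

\[ \int_a^\infty f := A. \]

(b) A function $f : [a, \infty) \to \mathbb{R}$ is said to be Lebesgue integrable if both $f$ and $|f|$ belong
to $\mathcal{R}^*[a, \infty)$. In this case we write $f \in \mathcal{L}[a, \infty)$.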


Of particular importance is the version of Hake’s Theorem for functions in R*[a, co). 


Other results for functions in £[a, o0) will be given in the exercises. 


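10.3.2 Hake's Theorem If $f : [a, \infty) \to \mathbb{R}$, then $f \in \mathcal{R}^*[a, \infty)$ if and only if for every
$\gamma \in (a, \infty)$ the restriction of $f$ to $[a, \gamma]$ belongs to $\mathcal{R}^*[a, \gamma]$ and

\[ \lim_{\gamma \to \infty} \int_a^\gamma f = A \in \mathbb{R}. \tag{5} \]

In this case $\int_a^\infty f = A$.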


The idea of the proof of Hake’s Theorem is as before; the details are given in [MTI]. 


The generalized Riemann integral on the unbounded interval $[a, \infty)$ has the same
properties as this integral on a bounded interval $[a, b]$ that were demonstrated in
Section 10.1. They can be obtained by either modifying the proofs given there, or by
using Hake's Theorem. We will give two examples.

10.3.3 Examples (a) If $f, g \in \mathcal{R}^*[a, \infty)$, then $f + g \in \mathcal{R}^*[a, \infty)$ and

\[ \int_a^\infty (f + g) = \int_a^\infty f + \int_a^\infty g. \]

If $\varepsilon > 0$ is given let $\delta_f$ be a gauge on $[a, \infty]$ such that if $\dot{\mathcal{P}}$ is $\delta_f$-fine, then
$|S(f; \dot{\mathcal{P}}) - \int_a^\infty f| \le \varepsilon/2$, and there exists a gauge $\delta_g$ such that if $\dot{\mathcal{P}}$ is $\delta_g$-fine, then
$|S(g; \dot{\mathcal{P}}) - \int_a^\infty g| \le \varepsilon/2$. Now let $\delta_\varepsilon(t) := \min\{\delta_f(t), \delta_g(t)\}$ for $t \in [a, \infty]$ and argue as
in the proof of 10.1.5(b).

(b) Let $f : [a, \infty) \to \mathbb{R}$ and let $c \in (a, \infty)$. Then $f \in \mathcal{R}^*[a, \infty)$ if and only if its
restrictions to $[a, c]$ and $[c, \infty)$ are integrable. In this case,

\[ \int_a^\infty f = \int_a^c f + \int_c^\infty f. \tag{6} \]

We will prove ($\Leftarrow$) using Hake's Theorem. By hypothesis, the restriction of $f$ to $[c, \infty)$
is integrable. Therefore, Hake's Theorem implies that for every $\gamma \in (c, \infty)$, the restriction
of $f$ to $[c, \gamma]$ is integrable and that

\[ \int_c^\infty f = \lim_{\gamma \to \infty} \int_c^\gamma f. \]

If we apply the Additivity Theorem 10.1.8 to the interval $[a, \gamma] = [a, c] \cup [c, \gamma]$, we
conclude that the restriction of $f$ to $[a, \gamma]$ is integrable and that

\[ \int_a^\gamma f = \int_a^c f + \int_c^\gamma f, \]

whence it follows that

\[ \lim_{\gamma \to \infty} \int_a^\gamma f = \int_a^c f + \lim_{\gamma \to \infty} \int_c^\gamma f = \int_a^c f + \int_c^\infty f. \]

Another application of Hake's Theorem establishes (6).


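10.3.4 Examples (a) Let α > 1 and let f_α(x) := 1/x^α for x ∈ [1, ∞). We will show that 
f_α ∈ R*[1, ∞). 
Indeed, if γ ∈ (1, ∞) then the restriction of f_α to [1, γ] is continuous and therefore 
belongs to R*[1, γ]. Moreover, we have 

\[ \int_1^{\gamma} \frac{1}{x^{\alpha}}\,dx = \frac{1}{1-\alpha}\, x^{1-\alpha} \Big|_1^{\gamma} = \frac{1}{\alpha-1}\left[ 1 - \frac{1}{\gamma^{\alpha-1}} \right]. \]

But since the last term tends to 1/(α − 1) as γ → ∞, Hake’s Theorem implies that f_α ∈ 
R*[1, ∞) and that 

\[ \int_1^{\infty} \frac{1}{x^{\alpha}}\,dx = \frac{1}{\alpha-1} \quad \text{when } \alpha > 1. \]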

(b) Let Σ_{k=1}^∞ a_k be a series of real numbers that converges to A ∈ R. We will 
construct a function s ∈ R*[0, ∞) such that 

\[ \int_0^{\infty} s = \sum_{k=1}^{\infty} a_k = A. \]

Indeed, we define s(x) := a_k for x ∈ [k − 1, k), k ∈ N. It is clear that the restriction 
of s to every subinterval [0, γ] is a step function, and therefore belongs to R*[0, γ]. 
Moreover, if γ ∈ [n, n + 1), then 

\[ \int_0^{\gamma} s = a_1 + \cdots + a_n + r_{\gamma}, \]

where |r_γ| ≤ |a_{n+1}|. But since the series is convergent, then r_γ → 0 and so Hake’s 
Theorem 10.3.2 implies that 

\[ \int_0^{\infty} s = \lim_{\gamma \to \infty} \int_0^{\gamma} s = \lim_{n \to \infty} \sum_{k=1}^{n} a_k = A. \]

(c) If the function s is defined as in (b), then |s| has the value |a_k| on the interval 
[k − 1, k), k ∈ N. Thus s belongs to L[0, ∞) if and only if the series Σ_{k=1}^∞ |a_k| is 
convergent; that is, if and only if Σ_{k=1}^∞ a_k is absolutely convergent. 
(d) Let D(x) := (sin x)/x for x ∈ (0, ∞) and let D(0) := 1. We will consider the 
important Dirichlet integral: 

\[ \int_0^{\infty} D(x)\,dx = \int_0^{\infty} \frac{\sin x}{x}\,dx. \]

Since the restriction of D to every interval [0, γ] is continuous, this restriction belongs 
to R*[0, γ]. To see that ∫_0^γ D(x) dx has a limit as γ → ∞, we let 0 < β < γ. An 
integration by parts shows that 

\[ \int_0^{\gamma} D(x)\,dx - \int_0^{\beta} D(x)\,dx = \int_{\beta}^{\gamma} \frac{\sin x}{x}\,dx = -\frac{\cos x}{x} \Big|_{\beta}^{\gamma} - \int_{\beta}^{\gamma} \frac{\cos x}{x^2}\,dx. \]

But since |cos x| ≤ 1, it is an exercise to show that the above terms approach 0 as β < γ 
tend to ∞. Therefore the Cauchy Condition applies and Hake’s Theorem implies that 
D ∈ R*[0, ∞). 

However, it will be seen in Exercise 13 that |D| does not belong to R*[0, ∞). Thus the 
function D does not belong to L[0, ∞). 
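The limits that Hake’s Theorem asks for in Examples 10.3.4(a) and (b) can be watched 
numerically. The following is only a sketch under stated assumptions: the helper names 
power_integral and step_integral are hypothetical, the value α = 2.5 is an arbitrary choice, 
and the alternating harmonic series (which converges to ln 2) is used as a sample series; 
none of these appear in the text. 

```python
import math


def power_integral(alpha: float, gamma: float) -> float:
    """Closed form of the integral of 1/x**alpha over [1, gamma], for alpha != 1."""
    return (1.0 - gamma ** (1.0 - alpha)) / (alpha - 1.0)


def step_integral(a: list, gamma: float) -> float:
    """Integral over [0, gamma] of the step function s with s(x) = a[k-1] on [k-1, k).

    Assumes 0 <= gamma <= len(a), so that every coefficient needed is available.
    """
    n = math.floor(gamma)                  # gamma lies in [n, n + 1)
    total = math.fsum(a[:n])               # a_1 + ... + a_n
    if n < len(a):
        total += a[n] * (gamma - n)        # remainder r_gamma, |r_gamma| <= |a_{n+1}|
    return total


if __name__ == "__main__":
    alpha = 2.5                            # illustrative choice with alpha > 1
    for gamma in (10.0, 1e3, 1e6):
        print(gamma, power_integral(alpha, gamma), 1.0 / (alpha - 1.0))

    # Sample series (an assumption for illustration): alternating harmonic series.
    a = [(-1) ** (k + 1) / k for k in range(1, 200_001)]
    for gamma in (10.5, 1000.5, 199_999.5):
        print(gamma, step_integral(a, gamma), math.log(2.0))
```

In both loops the first printed value approaches the second as γ grows, which is the 
convergence of the integrals over [1, γ] and [0, γ] that Hake’s Theorem requires. 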


We close this discussion of integrals over [a, oo) with a version of the Fundamental 
Theorem (First Form). 


10.3.5 Fundamental Theorem Suppose that E is a countable subset of [a, ∞) and that 
f, F : [a, ∞) → R are such that: 

(a) F is continuous on [a, ∞) and lim_{x→∞} F(x) exists. 
(b) F′(x) = f(x) for all x ∈ (a, ∞), x ∉ E. 

Then f belongs to R*[a, ∞) and 

(7)    \[ \int_a^{\infty} f = \lim_{x \to \infty} F(x) - F(a). \]

Proof. If γ is any number in (a, ∞), we can apply the Fundamental Theorem 10.1.9 to the 
interval [a, γ] to conclude that f belongs to R*[a, γ] and 

\[ \int_a^{\gamma} f = F(\gamma) - F(a). \]

Letting γ → ∞, we conclude from Hake’s Theorem that f ∈ R*[a, ∞) and that equation 
(7) holds. Q.E.D. 


Integrals over (−∞, b] 


We now discuss integration over closed intervals that are unbounded below. 

Let b ∈ R and g : (−∞, b] → R be a function that is to be integrated over the infinite 
interval (−∞, b]. By a gauge on [−∞, b] we mean an ordered pair consisting of a number 
d_* > 0 and a strictly positive function δ on (−∞, b]. We say that a tagged subpartition 
P := {([x_0, x_1], t_1), ([x_1, x_2], t_2), ..., ([x_{n−1}, b], t_n)} of (−∞, b] is (d_*, δ)-fine in the 
case that 

\[ (-\infty, b] = (-\infty, x_0] \cup \bigcup_{i=1}^{n} [x_{i-1}, x_i], \]

that 

\[ [x_{i-1}, x_i] \subseteq [t_i - \delta(t_i),\, t_i + \delta(t_i)] \quad \text{for } i = 1, \ldots, n, \]

and that 

\[ (-\infty, x_0] \subseteq (-\infty, -1/d_*] \]

or, equivalently, that 

\[ x_0 \le -1/d_*. \]

Note Ordinarily we consider a gauge on [−∞, b] to be a strictly positive function δ with 
domain [−∞, b] := {−∞} ∪ (−∞, b] where δ(−∞) := d_*. 

Here the Riemann sum of g for P is S(g; P) = Σ_{i=1}^n g(t_i)(x_i − x_{i−1}). 
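The three conditions for a tagged subpartition of (−∞, b] to be (d_*, δ)-fine can be 
checked mechanically. The sketch below is illustrative only: the names is_fine and 
riemann_sum, the list-of-pairs representation, and the sample gauge are assumptions made 
here, not notation from the text. It also checks that each tag lies in its subinterval, 
which is part of what "tagged" means. 

```python
from typing import Callable, List, Tuple

# Each entry is ((x_{i-1}, x_i), t_i); the first entry starts at x_0.
Tagged = List[Tuple[Tuple[float, float], float]]


def is_fine(partition: Tagged, b: float, d_star: float,
            delta: Callable[[float], float]) -> bool:
    """Return True if the tagged subpartition of (-inf, b] is (d_star, delta)-fine."""
    if not partition or d_star <= 0:
        return False
    x0 = partition[0][0][0]
    if x0 > -1.0 / d_star:                 # need (-inf, x0] inside (-inf, -1/d_star]
        return False
    left = x0
    for (lo, hi), t in partition:
        if lo != left or hi <= lo:         # subintervals must abut and be nondegenerate
            return False
        if not (lo <= t <= hi):            # the tag belongs to its subinterval
            return False
        if lo < t - delta(t) or hi > t + delta(t):
            return False                   # [lo, hi] must lie in [t - delta(t), t + delta(t)]
        left = hi
    return left == b                       # the last subinterval must end at b


def riemann_sum(g: Callable[[float], float], partition: Tagged) -> float:
    """S(g; P): sum of g(t_i) * (x_i - x_{i-1}); the unbounded piece contributes nothing."""
    return sum(g(t) * (hi - lo) for (lo, hi), t in partition)


if __name__ == "__main__":
    P = [((-2.0, -1.5), -1.75), ((-1.5, -1.0), -1.25),
         ((-1.0, -0.5), -0.75), ((-0.5, 0.0), -0.25)]
    constant_gauge = lambda t: 0.5         # sample gauge, chosen for illustration
    print(is_fine(P, 0.0, 1.0, constant_gauge))    # x_0 = -2 <= -1
    print(is_fine(P, 0.0, 0.25, constant_gauge))   # would need x_0 <= -4
    print(riemann_sum(lambda t: 1.0, P))
```

Shrinking d_* forces x_0 further to the left, which is how the gauge controls the part of 
(−∞, b] that the Riemann sum ignores. 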


Finally, we say that g : (−∞, b] → R is generalized Riemann integrable if there 
exists B ∈ R such that for every ε > 0 there exists a gauge δ_ε on [−∞, b] such that if P is 
any δ_ε-fine subpartition of (−∞, b], then |S(g; P) − B| < ε. In this case we write g ∈ 
R*(−∞, b] and 

\[ \int_{-\infty}^{b} g = B. \]

Similarly, a function g : (−∞, b] → R is said to be Lebesgue integrable if both g and |g| 
belong to R*(−∞, b]. In this case we will write g ∈ L(−∞, b]. 

The theorems valid for the integral over [a, ∞) are obtained in this case as well. Their 
formulation will be left to the reader. 


Integrals over (−∞, ∞) 


Let h : (−∞, ∞) → R be a function that we wish to integrate over the infinite interval 
(−∞, ∞). By a gauge on (−∞, ∞) we mean a triple consisting of a strictly positive function 
δ on (−∞, ∞) and two strictly positive numbers d_*, d^*. We say that a tagged subpartition 
P := {([x_0, x_1], t_1), ([x_1, x_2], t_2), ..., ([x_{n−1}, x_n], t_n)} is (d_*, δ, d^*)-fine in the case that 

\[ (-\infty, \infty) = (-\infty, x_0] \cup \bigcup_{i=1}^{n} [x_{i-1}, x_i] \cup [x_n, \infty), \]

that 

\[ [x_{i-1}, x_i] \subseteq [t_i - \delta(t_i),\, t_i + \delta(t_i)] \quad \text{for } i = 1, \ldots, n, \]

and that 

\[ (-\infty, x_0] \subseteq (-\infty, -1/d_*] \quad \text{and} \quad [x_n, \infty) \subseteq [1/d^*, \infty) \]

or, equivalently, that 

\[ x_0 \le -1/d_* \quad \text{and} \quad 1/d^* \le x_n. \]

Note Ordinarily we consider a gauge on [−∞, ∞] to be a strictly positive function 
δ with domain [−∞, ∞] := {−∞} ∪ (−∞, ∞) ∪ {∞} where δ(−∞) := d_* and δ(∞) := d^*. 

Here the Riemann sum of h for P is S(h; P) = Σ_{i=1}^n h(t_i)(x_i − x_{i−1}). 


Finally, we say that h : (−∞, ∞) → R is generalized Riemann integrable if there 
exists C ∈ R such that for every ε > 0 there exists a gauge δ_ε on [−∞, ∞] such that if P is 
any δ_ε-fine subpartition of (−∞, ∞), then |S(h; P) − C| < ε. In this case we write h ∈ 
R*(−∞, ∞) and 

\[ \int_{-\infty}^{\infty} h = C. \]

Similarly, a function h : (−∞, ∞) → R is said to be Lebesgue integrable if both h and |h| 
belong to R*(−∞, ∞). In this case we write h ∈ L(−∞, ∞). 

In view of its importance, we will state the version of Hake’s Theorem that is valid for 
the integral over (−∞, ∞). 


10.3.6 Hake’s Theorem If h : (−∞, ∞) → R, then h ∈ R*(−∞, ∞) if and only if for 
every β < γ in (−∞, ∞), the restriction of h to [β, γ] is in R*[β, γ] and 

\[ \lim_{\substack{\beta \to -\infty \\ \gamma \to +\infty}} \int_{\beta}^{\gamma} h = C \in \mathbb{R}. \]

In this case ∫_{−∞}^∞ h = C. 


As before, most of the theorems valid for the finite interval [a, b] remain true. They 
are proved as before, or by using Hake’s Theorem. We also state the first form of the 
Fundamental Theorem for this case. 


10.3.7 Fundamental Theorem Suppose that E is a countable subset of (−∞, ∞) and 
that h, H : (−∞, ∞) → R satisfy: 

(a) H is continuous on (−∞, ∞) and the limits lim_{x→±∞} H(x) exist. 
(b) H′(x) = h(x) for all x ∈ (−∞, ∞), x ∉ E. 

Then h belongs to R*(−∞, ∞) and 

(8)    \[ \int_{-\infty}^{\infty} h = \lim_{x \to \infty} H(x) - \lim_{y \to -\infty} H(y). \]

10.3.8 Examples (a) Let h(x) := 1/(x² + 1) for x ∈ (−∞, ∞). If we let 
H(x) := Arctan x, then H′(x) = h(x) for all x ∈ (−∞, ∞). Further, we have lim_{x→∞} H(x) = 
½π and lim_{x→−∞} H(x) = −½π. Therefore it follows that 

\[ \int_{-\infty}^{\infty} \frac{1}{x^2+1}\,dx = \tfrac{1}{2}\pi - \left(-\tfrac{1}{2}\pi\right) = \pi. \]

(b) Let k(x) := |x| e^{−x²} for x ∈ (−∞, ∞). If we let K(x) := ½(1 − e^{−x²}) for x ≥ 0 and 
K(x) := −½(1 − e^{−x²}) for x < 0, then it is seen that K is continuous on (−∞, ∞) and that 
K′(x) = k(x) for x ≠ 0. Further, lim_{x→∞} K(x) = ½ and lim_{x→−∞} K(x) = −½. Therefore it 
follows that 

\[ \int_{-\infty}^{\infty} |x|\,e^{-x^2}\,dx = \tfrac{1}{2} - \left(-\tfrac{1}{2}\right) = 1. \]
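As a sanity check on Examples 10.3.8, one can compare a crude numerical integral over a 
large symmetric interval with the difference of antiderivative values that formula (8) 
predicts. This is only a sketch: the midpoint rule, the helper name midpoint_rule, and the 
cutoffs R = 50 and R = 8 are choices made here for illustration and are not part of the text. 

```python
import math


def midpoint_rule(f, lo, hi, n=100_000):
    """Composite midpoint rule for the integral of f over [lo, hi] with n panels."""
    width = (hi - lo) / n
    return width * math.fsum(f(lo + (i + 0.5) * width) for i in range(n))


def h(x):
    return 1.0 / (x * x + 1.0)


def k(x):
    return abs(x) * math.exp(-x * x)


def K(x):
    """Antiderivative of k from Example 10.3.8(b)."""
    value = 0.5 * (1.0 - math.exp(-x * x))
    return value if x >= 0 else -value


if __name__ == "__main__":
    R = 50.0
    print(midpoint_rule(h, -R, R), math.atan(R) - math.atan(-R), math.pi)
    R = 8.0
    print(midpoint_rule(k, -R, R), K(R) - K(-R), 1.0)
```

On each line the first two numbers agree closely, and the second tends to the third as the 
cutoff R grows, in line with Hake’s Theorem 10.3.6. 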


Exercises for Section 10.3 


1. Let δ be a gauge on [a, ∞]. From Theorem 5.5.5, every bounded subinterval [a, b] has a δ-fine 
partition. Now show that [a, ∞] has a δ-fine partition. 

2. Let f ∈ R*[a, γ] for all γ > a. Show that f ∈ R*[a, ∞) if and only if for every ε > 0 there exists 
K(ε) > a such that if q > p > K(ε), then |∫_p^q f| < ε. 

3. Let f and |f| belong to R*[a, γ] for all γ > a. Show that f ∈ L[a, ∞) if and only if for every 
ε > 0 there exists K(ε) > a such that if q > p > K(ε) then ∫_p^q |f| < ε. 

4. Let f and |f| belong to R*[a, γ] for every γ > a. Show that f ∈ L[a, ∞) if and only if the set 
V := {∫_a^x |f| : x > a} is bounded in R. 

5. If f, g ∈ L[a, ∞), show that f + g ∈ L[a, ∞). Moreover, if ‖h‖ := ∫_a^∞ |h| for any h ∈ L[a, ∞), 
show that ‖f + g‖ ≤ ‖f‖ + ‖g‖. 

6. If f(x) := 1/x for x ∈ [1, ∞), show that f ∉ R*[1, ∞). 

7. If f is continuous on [1, ∞) and if |f(x)| ≤ K/x² for x ∈ [1, ∞), show that f ∈ L[1, ∞). 

8. Let f(x) := cos x for x ∈ [0, ∞). Show that f ∉ R*[0, ∞). 

9. If s > 0, let g(x) := e^{−sx} for x ∈ [0, ∞). 
(a) Use Hake’s Theorem to show that g ∈ L[0, ∞) and ∫_0^∞ e^{−sx} dx = 1/s. 
(b) Use the Fundamental Theorem 10.3.5. 

10. (a) Use Integration by Parts and Hake’s Theorem to show that ∫_0^∞ x e^{−sx} dx = 1/s² for s > 0. 
(b) Use the Fundamental Theorem 10.3.5. 

11. Show that if n ∈ N, s > 0, then ∫_0^∞ x^n e^{−sx} dx = n!/s^{n+1}. 

12. (a) Show that the integral ∫_1^∞ x^{−1} ln x dx does not converge. 
(b) Show that if α > 1, then ∫_1^∞ x^{−α} ln x dx = 1/(α − 1)². 

13. (a) Show that ∫_{nπ}^{(n+1)π} |x^{−1} sin x| dx > 1/(4(n + 1)). 
(b) Show that |D| ∉ R*[0, ∞), where D is as in Example 10.3.4(d). 

14. Show that the integral ∫_0^∞ (1/√x) sin x dx converges. [Hint: Integrate by Parts.] 


15. Establish the convergence of Fresnel’s integral ∫_0^∞ sin(x²) dx. [Hint: Use the Substitution 
Theorem 10.1.12.] 

16. Establish the convergence or the divergence of the following integrals: 

(a) ∫_0^∞ (ln x)/(x² + 1) dx,        (b) ∫_0^∞ (ln x)/√(x² + 1) dx, 
(c) ∫_0^∞ dx/(x(x + 1)),             (d) ∫_0^∞ x/(x + 1)³ dx, 
(e) ∫_0^∞ dx/∛(1 + x³),              (f) ∫_0^∞ (Arctan x)/(x^{3/2} + 1) dx. 


17. Let f, φ : [a, ∞) → R. Abel’s Test asserts that if f ∈ R*[a, ∞) and φ is bounded and monotone 
on [a, ∞), then fφ ∈ R*[a, ∞). 
(a) Show that Abel’s Test does not apply to establish the convergence of ∫_0^∞ (1/x) sin x dx by 
taking φ(x) := 1/x. However, it does apply if we take φ(x) := 1/√x and use Exercise 14. 
(b) Use Abel’s Test and Exercise 15 to show the convergence of ∫_0^∞ (x/(x + 1)) sin(x²) dx. 
(c) Use Abel’s Test and Exercise 14 to show the convergence of ∫_0^∞ x^{−3/2}(x + 1) sin x dx. 
(d) Use Abel’s Test to obtain the convergence of Exercise 16(f). 

18. With the notation as in Exercise 17, the Chartier-Dirichlet Test asserts that if f ∈ R*[a, γ] for 
all γ > a, if F(x) := ∫_a^x f is bounded on [a, ∞), and if φ is monotone and lim_{x→∞} φ(x) = 0, then 
fφ ∈ R*[a, ∞). 
(a) Show that the integral ∫_0^∞ (1/x) sin x dx converges. 
(b) Show that ∫_2^∞ (1/ln x) sin x dx converges. 
(c) Show that ∫_0^∞ (1/√x) cos x dx converges. 
(d) Show that the Chartier-Dirichlet Test does not apply to establish the convergence of 
∫_0^∞ (x/(x + 1)) sin(x²) dx by taking f(x) := sin(x²). 

19. Show that the integral ∫_0^∞ √x · sin(x²) dx is convergent, even though the integrand is not 
bounded as x → ∞. [Hint: Make a substitution.] 


20. Establish the convergence of the following integrals: 

(a) ∫_{−∞}^∞ e^{−|x|} dx,            (b) ∫_{−∞}^∞ (x − 2) e^{−|x|} dx, 
(c) ∫_{−∞}^∞ e^{−x²} dx,             (d) ∫_{−∞}^∞ 2x/(e^x − e^{−x}) dx. 


Section 10.4 Convergence Theorems 


We will conclude our discussion of the generalized Riemann integral with an indication of 
the convergence theorems that are available for it. It will be seen that the results are much 
stronger than those presented in Section 8.2 for the (ordinary) Riemann integral. Finally, 
we will introduce a “measurable” function on [a, b] as the almost everywhere limit of a 
sequence of step functions. We will show that every integrable function is measurable, and 
that a measurable function on [a, b] is generalized Riemann integrable if and only if it 
satisfies a two-sided boundedness condition. 

We proved in Example 8.2.1(c) that if (f_k) is a sequence in R[a, b] that converges on 
[a, b] to a function f ∈ R[a, b], then it need not happen that 

(1)    \[ \int_a^b f = \lim_{k \to \infty} \int_a^b f_k. \]

However, in Theorem 8.2.4 we saw that uniform convergence of the sequence is sufficient 
to guarantee that this equality holds. In fact, we will now show that this is even true for a 
sequence of generalized Riemann integrable functions. 

10.4.1 Uniform Convergence Theorem Let (f_k) be a sequence in R*[a, b] and suppose 
that (f_k) converges uniformly on [a, b] to f. Then f ∈ R*[a, b] and (1) holds. 

Proof. Given ε > 0, there exists K(ε) such that if k ≥ K(ε) and x ∈ [a, b], then we have 
|f_k(x) − f(x)| < ε. Consequently, if h, k ≥ K(ε), then 

\[ -2\varepsilon < f_k(x) - f_h(x) < 2\varepsilon \quad \text{for } x \in [a, b]. \]

Theorem 10.1.5 implies that 

\[ -2\varepsilon (b - a) < \int_a^b f_k - \int_a^b f_h < 2\varepsilon (b - a). \]

Since ε > 0 is arbitrary, the sequence (∫_a^b f_k) is a Cauchy sequence in R and therefore 
converges to some number, say A ∈ R. We will now show that f ∈ R*[a, b] with integral A. 
For, if ε > 0 is given, let K(ε) be as above. If P := {([x_{i−1}, x_i], t_i)}_{i=1}^n is any tagged 
partition of [a, b] and if k ≥ K(ε), then 

\[ |S(f_k; P) - S(f; P)| = \Big| \sum_{i=1}^{n} \{ f_k(t_i) - f(t_i) \}(x_i - x_{i-1}) \Big| \le \sum_{i=1}^{n} |f_k(t_i) - f(t_i)|\,(x_i - x_{i-1}) < \varepsilon (b - a). \]

Now fix r ≥ K(ε) such that |∫_a^b f_r − A| < ε and let δ_{r,ε} be a gauge on [a, b] such that 
|∫_a^b f_r − S(f_r; P)| < ε whenever P is δ_{r,ε}-fine. Then we have 

\[ |S(f; P) - A| \le |S(f; P) - S(f_r; P)| + \Big| S(f_r; P) - \int_a^b f_r \Big| + \Big| \int_a^b f_r - A \Big| < \varepsilon (b - a) + \varepsilon + \varepsilon = \varepsilon (b - a + 2). \]

But since ε > 0 is arbitrary, it follows that f ∈ R*[a, b] and ∫_a^b f = A. Q.E.D. 


It will be seen in Example 10.4.6(a) that the conclusion of 10.4.1 is false for an infinite 
interval. 
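The failure on an infinite interval can be made concrete with the sequence that the text 
takes up in Example 10.4.6(a): f_k equals 1/k on [0, k] and 0 elsewhere on [0, ∞). The 
following sketch uses exact rational arithmetic; the function names are hypothetical and 
chosen only for this illustration. 

```python
from fractions import Fraction


def f(k: int, x: Fraction) -> Fraction:
    """f_k(x) = 1/k on [0, k] and 0 elsewhere on [0, infinity)."""
    return Fraction(1, k) if 0 <= x <= k else Fraction(0)


def sup_distance_to_zero(k: int) -> Fraction:
    """sup over x >= 0 of |f_k(x) - 0|, which is attained on [0, k]."""
    return Fraction(1, k)


def integral(k: int) -> Fraction:
    """Integral of f_k over [0, infinity): height 1/k times base length k."""
    return Fraction(1, k) * k


if __name__ == "__main__":
    for k in (1, 10, 1000):
        print(k, sup_distance_to_zero(k), integral(k))
```

The sup-distance to the 0-function shrinks like 1/k, so the convergence is uniform, yet 
every integral equals 1 while the integral of the limit is 0. 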


Equi-integrability 


The hypothesis of uniform convergence in Theorem 10.4.1 is a very stringent one and 
restricts the utility of this result. Consequently, we now show that another type of 
uniformity condition can be used to obtain the desired limit. This notion is due to Jaroslav 
Kurzweil, as is Theorem 10.4.3. 


10.4.2 Definition A sequence (f_k) in R*(I) is said to be equi-integrable if for every 
ε > 0 there exists a gauge δ_ε on I such that if P is any δ_ε-fine partition of I and k ∈ N, then 

\[ \Big| S(f_k; P) - \int_I f_k \Big| < \varepsilon. \]


10.4.3 Equi-integrability Theorem If (f_k) ∈ R*(I) is equi-integrable on I and if 
f(x) = lim f_k(x) for all x ∈ I, then f ∈ R*(I) and 

(2)    \[ \int_I f = \lim_{k \to \infty} \int_I f_k. \]

Proof. We will treat the case I = [a, b]; the general case can be found in [MTI]. 

Given ε > 0, by the equi-integrability hypothesis, there exists a gauge δ_ε on I such that 
if P := {([x_{i−1}, x_i], t_i)}_{i=1}^n is a δ_ε-fine partition of I, then we have |S(f_k; P) − ∫_I f_k| < ε for 
all k ∈ N. Since P has only a finite number of tags and since f_k(t) → f(t) for t ∈ [a, b], 
there exists a K_ε such that if h, k ≥ K_ε, then 

(3)    \[ |S(f_k; P) - S(f_h; P)| \le \sum_{i=1}^{n} |f_k(t_i) - f_h(t_i)|\,(x_i - x_{i-1}) \le \varepsilon (b - a). \]

If we let h → ∞ in (3), we have 

(4)    \[ |S(f_k; P) - S(f; P)| \le \varepsilon (b - a) \quad \text{for } k \ge K_\varepsilon. \]

Moreover, if h, k ≥ K_ε, then the equi-integrability hypothesis and (3) give 

\[ \Big| \int_I f_k - \int_I f_h \Big| \le \Big| \int_I f_k - S(f_k; P) \Big| + |S(f_k; P) - S(f_h; P)| + \Big| S(f_h; P) - \int_I f_h \Big| \le \varepsilon + \varepsilon (b - a) + \varepsilon = \varepsilon (2 + b - a). \]

Since ε > 0 is arbitrary, then (∫_I f_k) is a Cauchy sequence and converges to some A ∈ R. If 
we let h → ∞ in this last inequality, we obtain 

(5)    \[ \Big| \int_I f_k - A \Big| \le \varepsilon (2 + b - a) \quad \text{for } k \ge K_\varepsilon. \]

We now show that f ∈ R*(I) with integral A. Indeed, given ε > 0, if P is a δ_ε-fine 
partition of I and k ≥ K_ε, then 

\[ |S(f; P) - A| \le |S(f; P) - S(f_k; P)| + \Big| S(f_k; P) - \int_I f_k \Big| + \Big| \int_I f_k - A \Big| \le \varepsilon (b - a) + \varepsilon + \varepsilon (2 + b - a) = \varepsilon (3 + 2b - 2a), \]

where we used (4) for the first term, the equi-integrability for the second, and (5) for the 
third. Since ε > 0 is arbitrary, f ∈ R*(I) with integral A. Q.E.D. 


The Monotone and Dominated Convergence Theorems 


Although the Equi-integrability Theorem is interesting, it is difficult to apply because it is 
not easy to construct the gauges δ_ε. We now state the two most important convergence 
theorems for the integral, both of which are often useful. McLeod [pp. 96-101] has shown 
that both of these theorems can be proved by using the Equi-integrability Theorem. 
However, those proofs require a delicate construction of the gauge functions. Direct proofs 
of these results are given in [MTI], but these proofs also use results not given here; 
therefore we will omit the proofs of these results. 

We say that a sequence of functions on an interval I ⊆ R is monotone increasing if it 
satisfies f_1(x) ≤ f_2(x) ≤ ··· ≤ f_k(x) ≤ f_{k+1}(x) ≤ ··· for all k ∈ N, x ∈ I. It is said to be 
monotone decreasing if it satisfies the opposite string of inequalities, and to be monotone 
if it is either monotone increasing or decreasing. 


10.4.4 Monotone Convergence Theorem Let (f_k) be a monotone sequence of functions 
in R*(I) such that f(x) = lim f_k(x) almost everywhere on I. Then f ∈ R*(I) if and 
only if the sequence of integrals (∫_I f_k) is bounded in R, in which case 

(6)    \[ \int_I f = \lim_{k \to \infty} \int_I f_k. \]

The next result is the most important theorem concerning the convergence of 
integrable functions. It is an extension of the celebrated “Lebesgue Dominated Convergence 
Theorem” from which it can also be proved. 


10.4.5 Dominated Convergence Theorem Let (f_k) be a sequence in R*(I) and let 
f(x) = lim f_k(x) almost everywhere on I. If there exist functions α, ω in R*(I) such that 

(7)    α(x) ≤ f_k(x) ≤ ω(x)    for almost every x ∈ I, 

then f ∈ R*(I) and 

(8)    \[ \int_I f = \lim_{k \to \infty} \int_I f_k. \]

Moreover, if α and ω belong to L(I), then f_k and f belong to L(I) and 

(9)    \[ \| f_k - f \| = \int_I |f_k - f| \to 0. \]

Note If α and ω belong to L(I), and we put φ := max{|α|, |ω|}, then φ ∈ L(I) and we can 
replace the condition (7) by the condition 

(7′)    |f_k(x)| ≤ φ(x)    for almost every x ∈ I. 


Some Examples 


10.4.6 Examples (a) If k ∈ N, let f_k(x) := 1/k for x ∈ [0, k] and f_k(x) := 0 elsewhere 
in [0, ∞). 

Then the sequence converges uniformly on [0, ∞) to the 0-function. However, 
∫_0^∞ f_k = 1 for all k ∈ N, while the integral of the 0-function equals 0. It is an exercise 
to show that the function sup{f_k(x) : k ∈ N} does not belong to R*[0, ∞), so the 
domination condition (7) is not satisfied. 

(b) We have 

\[ \lim_{k \to \infty} \int_0^1 \frac{x^k + 1}{x^k + 3}\,dx = \frac{1}{3}. \]

For, if g_k(x) := (x^k + 1)/(x^k + 3), then 0 ≤ g_k(x) ≤ 1 and g_k(x) → 1/3 for 
x ∈ [0, 1). Thus the Dominated Convergence Theorem 10.4.5 applies. 

(c) We have 

\[ \lim_{k \to \infty} \int_0^k \Big( 1 + \frac{x}{k} \Big)^{k} e^{-ax}\,dx = \frac{1}{a - 1} \quad \text{if } a > 1. \]

Let h_k(x) := (1 + x/k)^k e^{−ax} for x ∈ [0, k] and h_k(x) := 0 elsewhere on [0, ∞). The 
argument in Example 3.3.6 shows that (h_k) is an increasing sequence and converges to 
e^x e^{−ax} = e^{(1−a)x} on [0, ∞). If a > 1 this limit function belongs to L[0, ∞). Moreover, if 
F(x) := e^{(1−a)x}/(1 − a), then F′(x) = e^{(1−a)x} so that the Monotone Convergence 
Theorem 10.4.4 and the Fundamental Theorem 10.3.5 imply that 

\[ \lim_{k \to \infty} \int_0^{\infty} h_k = \int_0^{\infty} e^{(1-a)x}\,dx = F(x) \Big|_0^{\infty} = \frac{1}{a - 1}. \]


(d) If f is bounded and continuous on [0, ∞) and if a > 0, then the function defined by 
L(t) := ∫_0^∞ e^{−tx} f(x) dx is continuous for t ∈ J_a := (a, ∞). 

Since |e^{−tx} f(x)| ≤ M e^{−ax} for t ∈ J_a (where M is a bound for |f|), if (t_k) is any 
sequence in J_a converging to t_0 ∈ J_a, the Dominated Convergence Theorem implies that 
L(t_k) → L(t_0). But since the sequence (t_k) → t_0 is arbitrary, then L is continuous at t_0. 

(e) The integral in (d) is differentiable for t > a and 

(10)    \[ L'(t) = \int_0^{\infty} (-x)\,e^{-tx} f(x)\,dx, \]

which is the result obtained by “differentiating under the integral sign” with respect to t. 

Fix a number t_0 ∈ J_a. If t ∈ J_a, then by the Mean Value Theorem applied to the 
function t ↦ e^{−tx}, there exists a point t_x between t_0 and t such that we have 
e^{−tx} − e^{−t_0 x} = −x e^{−t_x x}(t − t_0), whence 

\[ \Big| \frac{e^{-tx} - e^{-t_0 x}}{t - t_0} \Big| \le x\,e^{-t_x x} \le x\,e^{-ax}. \]

Since ω(x) := x e^{−ax} f(x) belongs to L[0, ∞), then for any sequence (t_k) in J_a with 
t_0 ≠ t_k → t_0, the Dominated Convergence Theorem implies that 

\[ \lim_{k \to \infty} \frac{L(t_k) - L(t_0)}{t_k - t_0} = \int_0^{\infty} \lim_{k \to \infty} \Big[ \frac{e^{-t_k x} - e^{-t_0 x}}{t_k - t_0} \Big] f(x)\,dx = \int_0^{\infty} (-x)\,e^{-t_0 x} f(x)\,dx. \]

Since (t_k) is an arbitrary sequence, then L′(t_0) exists and (10) is proved. 


(f) Let 

\[ D_k(t) := \int_0^k e^{-tx} \Big( \frac{\sin x}{x} \Big) dx \quad \text{for } k \in \mathbb{N},\ t \ge 0. \]

Since |(e^{−tx} sin x)/x| ≤ e^{−tx} ≤ 1 for t ≥ 0, x ≥ 0, the integral defining D_k exists. In 
particular, we have 

\[ D_k(0) = \int_0^k \frac{\sin x}{x}\,dx. \]

We want to show that D_k(0) → ½π as k → ∞. By Example 10.3.4(d), this will show that 
∫_0^∞ (sin x)/x dx = ½π. The argument is rather complex, and uses the Dominated 
Convergence Theorem several times. 

Since the partial derivative satisfies 

\[ \Big| \frac{\partial}{\partial t} \Big( \frac{e^{-tx} \sin x}{x} \Big) \Big| = | -e^{-tx} \sin x | \le 1 \quad \text{for } t \ge 0,\ x \ge 0, \]

an argument as in (e) and the Dominated Convergence Theorem imply that 

\[ D_k'(t) = -\int_0^k e^{-tx} \sin x\,dx \quad \text{for } k \in \mathbb{N},\ t \ge 0. \]

Since a routine calculation shows that 

\[ \frac{\partial}{\partial x} \Big[ \frac{e^{-tx}(t \sin x + \cos x)}{t^2 + 1} \Big] = -e^{-tx} \sin x, \]

then an application of the Fundamental Theorem gives 

\[ D_k'(t) = \frac{e^{-tk}(t \sin k + \cos k)}{t^2 + 1} - \frac{1}{t^2 + 1}. \]

If we put g_k(t) := e^{−tk}(t sin k + cos k)/(t² + 1) for 0 ≤ t ≤ k and g_k(t) := 0 for t > k, 
then another application of the Fundamental Theorem gives 

(11)    \[ D_k(k) - D_k(0) = \int_0^k D_k'(t)\,dt = \int_0^k g_k(t)\,dt - \int_0^k \frac{dt}{t^2 + 1} = \int_0^{\infty} g_k(t)\,dt - \operatorname{Arctan} k. \]

If we note that g_k(t) → 0 for t > 0 as k → ∞ and that (since k ≥ 1) 

\[ |g_k(t)| \le \frac{e^{-t}(t + 1)}{t^2 + 1} \le 2 e^{-t} \quad \text{for } t \ge 0, \]

then the Dominated Convergence Theorem gives ∫_0^∞ g_k(t) dt → 0. 
In addition, since |(sin x)/x| ≤ 1, we have 

\[ |D_k(k)| = \Big| \int_0^k e^{-kx}\,\frac{\sin x}{x}\,dx \Big| \le \int_0^k e^{-kx}\,dx = \frac{e^{-kx}}{-k} \Big|_{x=0}^{x=k} = \frac{1 - e^{-k^2}}{k} \le \frac{1}{k} \to 0. \]

Therefore, as k → ∞, formula (11) becomes 

\[ 0 - \lim_{k \to \infty} D_k(0) = 0 - \lim_{k \to \infty} \operatorname{Arctan} k = -\tfrac{1}{2}\pi. \]

As we have noted before, this gives an evaluation of Dirichlet’s Integral: 

(12)    \[ \int_0^{\infty} \frac{\sin x}{x}\,dx = \tfrac{1}{2}\pi. \]
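The limits claimed in Examples 10.4.6(b) and (c) can be checked numerically. The sketch 
below is illustrative only: the composite midpoint rule, the helper names, and the choice 
a = 2 are assumptions made here, not part of the text. 

```python
import math


def midpoint_rule(f, lo, hi, n=20_000):
    """Composite midpoint rule for the integral of f over [lo, hi] with n panels."""
    width = (hi - lo) / n
    return width * math.fsum(f(lo + (i + 0.5) * width) for i in range(n))


def integral_b(k: int) -> float:
    """Integral over [0, 1] of g_k(x) = (x^k + 1)/(x^k + 3) from Example 10.4.6(b)."""
    return midpoint_rule(lambda x: (x ** k + 1.0) / (x ** k + 3.0), 0.0, 1.0)


def integral_c(k: int, a: float) -> float:
    """Integral over [0, k] of h_k(x) = (1 + x/k)^k e^{-ax} from Example 10.4.6(c)."""
    return midpoint_rule(lambda x: (1.0 + x / k) ** k * math.exp(-a * x), 0.0, float(k))


if __name__ == "__main__":
    for k in (1, 10, 100, 1000):
        print(k, integral_b(k), 1.0 / 3.0)
    a = 2.0                                # illustrative choice with a > 1
    for k in (1, 10, 100, 1000):
        print(k, integral_c(k, a), 1.0 / (a - 1.0))
```

The values for (b) decrease toward 1/3, and those for (c) increase toward 1/(a − 1), as the 
Dominated and Monotone Convergence Theorems predict. 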


Measurable Functions 


We wish to characterize the collection of functions in R*(I). In order to bypass a few minor 
details, we will limit our discussion to the case I := [a, b]. We need to introduce the notion of a 
“measurable function”; this class of functions contains all the functions the reader is ever 
likely to encounter. Measurable functions are often defined in terms of the notion of a 
“measurable set.” However, the approach we will use is somewhat simpler and does not 
require a theory of measurable sets to have been developed first. (In fact, the theory of 
measure can be derived from properties of the integral; see Exercises 15 and 16.) 

We recall from Definition 5.4.9 that a function s : [a, b] → R is a step function if it has 
only a finite number of values, each value being assumed on a finite number of subintervals 
of [a, b]. 


10.4.7 Definition A function f : [a, b] → R is said to be (Lebesgue) measurable if there 
exists a sequence (s_k) of step functions on [a, b] such that 

(13)    f(x) = lim_{k→∞} s_k(x)    for almost every x ∈ [a, b]. 

We denote the collection of all measurable functions on [a, b] by M[a, b]. 

We can reformulate the definition as: A function f is in M[a, b] if there exists a null set 
Z ⊂ [a, b] and a sequence (s_k) of step functions such that 

(14)    f(x) = lim_{k→∞} s_k(x)    for all x ∈ [a, b]\Z. 


It is trivial that every step function on [a, b] is a measurable function. By Theorem 5.4.10, 
a continuous function on [a, b] is a uniform limit of a sequence of step functions; therefore, 
every continuous function on an interval [a, b] is measurable. Similarly, every monotone 
function on [a, b] is a uniform limit of step functions (see the proof of Theorem 7.2.8); 
therefore, every monotone function on an interval is measurable. 

At first glance, it might seem that the collection of measurable functions is not 
very large. However, because the limit (13) is required to hold only almost 
everywhere (and not everywhere), one can obtain much more general functions. We 
now give a few examples. 


10.4.8 Examples (a) The Dirichlet function, f(x) := 1 for x ∈ [0, 1] rational and 
f(x) := 0 for x ∈ [0, 1] irrational, is a measurable function. 
Since Q ∩ [0, 1] is a null set, we can take each s_k to be the 0-function. We then obtain 
s_k(x) → f(x) for x ∈ [0, 1]\Q. 
(b) Thomae’s function h (see Examples 5.1.6(h) and 7.1.7) is a measurable function. 
Again, take s_k to be the 0-function. Then s_k(x) → h(x) for x ∈ [0, 1]\Q. 
(c) The function g(x) := 1/x for x ∈ (0, 1] and g(0) := 0 is a measurable function. 
This can be seen by taking a step function s_k with s_k(x) := 0 for x ∈ [0, 1/k) and (using 5.4.10) 
such that |s_k(x) − 1/x| < 1/k for x ∈ [1/k, 1]. Then s_k(x) → g(x) for all x ∈ [0, 1]. 
(d) If f ∈ M[a, b] and if ψ : [a, b] → R is such that ψ(x) = f(x) a.e., then ψ ∈ M[a, b]. 
For, if f(x) = lim s_k(x) for x ∈ [a, b]\Z_1, and if ψ(x) = f(x) for all x ∈ [a, b]\Z_2, then 
ψ(x) = lim s_k(x) for all x ∈ [a, b]\(Z_1 ∪ Z_2). Since Z_1 ∪ Z_2 is a null set when Z_1 and Z_2 are, 
the conclusion follows. 


The next result shows that elementary combinations of measurable functions lead to 
measurable functions. 


10.4.9 Theorem Let f and g belong to M[a, b] and let c ∈ R. 

(a) Then the functions cf, |f|, f + g, f − g, and f · g also belong to M[a, b]. 
(b) If φ : R → R is continuous, then the composition φ ∘ f ∈ M[a, b]. 
(c) If (f_n) is a sequence in M[a, b] and f(x) = lim f_n(x) almost everywhere on I, then 
f ∈ M[a, b]. 

Proof. (a) We will prove that |f| is measurable. Let Z ⊂ [a, b] be a null set such that (14) 
holds. Since |s_k| is a step function, the Triangle Inequality implies that 

\[ 0 \le \big|\, |f(x)| - |s_k(x)| \,\big| \le |f(x) - s_k(x)| \to 0 \]

for all x ∈ [a, b]\Z. Therefore |f| ∈ M[a, b]. 
The other assertions in (a) follow from the basic properties of limits. 

(b) If s_k is a step function on [a, b], it is easily seen that φ ∘ s_k is also a step function on 
[a, b]. Since φ is continuous on R and f(x) = lim s_k(x) for all x ∈ [a, b]\Z, it follows that 
(φ ∘ f)(x) = φ(f(x)) = lim φ(s_k(x)) = lim (φ ∘ s_k)(x) for all x ∈ [a, b]\Z. Therefore 
φ ∘ f is measurable. 

(c) This conclusion is not obvious; a proof is outlined in Exercise 14. Q.E.D. 


The next result is that we can replace the step functions in Definition 10.4.7 by 
continuous functions. Since we will use only one part of this result, we content ourselves 
with a sketch of the proof of the other part. 


10.4.10 Theorem A function f : [a, b] → R is in M[a, b] if and only if there exists a 
sequence (g_k) of continuous functions such that 

(15)    f(x) = lim_{k→∞} g_k(x)    for almost every x ∈ [a, b]. 

Proof. (⇐) Let Z ⊂ [a, b] be a null set and (g_k) be a sequence of continuous functions 
such that f(x) = lim g_k(x) for x ∈ [a, b]\Z. Since g_k is continuous, by 5.4.10 there exists a 
step function s_k such that 

\[ |g_k(x) - s_k(x)| \le 1/k \quad \text{for all } x \in [a, b]. \]

Therefore we have 

\[ 0 \le |f(x) - s_k(x)| \le |f(x) - g_k(x)| + |g_k(x) - s_k(x)| \le |f(x) - g_k(x)| + 1/k, \]

whence it follows that f(x) = lim s_k(x) for all x ∈ [a, b]\Z. 

Sketch of (⇒) Let Z be a null set and (s_k) be a sequence of step functions such that 
f(x) = lim s_k(x) for all x ∈ [a, b]\Z. Without loss of generality, we may assume that each 
s_k is continuous at the endpoints a, b. Since s_k is discontinuous at only a finite number of 
points in (a, b), which can be enclosed in a finite union J_k of intervals with total length ≤ 
1/k, we can construct a piecewise linear and continuous function g_k that coincides with 
s_k on [a, b]\J_k. It can be shown that g_k(x) → f(x) a.e. on I. (See [MTI] for the details.) 

Q.E.D. 


Functions in R*[a, b] are Measurable 


We now show that a generalized Riemann integrable function is measurable. 

10.4.11 Measurability Theorem If f ∈ R*[a, b], then f ∈ M[a, b]. 

Proof. Let F : [a, b + 1] → R be the indefinite integral 

\[ F(x) := \int_a^x f \quad \text{if } x \in [a, b], \]

and let F(x) := F(b) for x ∈ (b, b + 1]. It follows from the Fundamental Theorem 
(Second Form) 10.1.11(a) that F is continuous on [a, b]. From 10.1.11(c), there exists a 
null set Z such that the derivative F′(x) = f(x) exists for x ∈ [a, b]\Z. Therefore, if we 
introduce the difference quotient functions 

\[ g_k(x) := \frac{F(x + 1/k) - F(x)}{1/k} \quad \text{for } x \in [a, b),\ k \in \mathbb{N}, \]

then g_k(x) → f(x) for all x ∈ [a, b]\Z. Since the g_k are continuous, it follows from the part 
of Theorem 10.4.10 we have proved that f ∈ M[a, b]. Q.E.D. 


Are Measurable Functions Integrable? 


Not every measurable function is generalized Riemann integrable. For example, the 
function g(x) := 1/x for x ∈ (0, 1] and g(0) := 0 was seen in Example 10.4.8(c) to be 
measurable; however, it is not in R*[0, 1] because it is “too large” (as x → 0+). However, 
if the graph of a measurable function on [a, b] lies between two functions in R*[a, b], then it 
also belongs to R*[a, b]. 

10.4.12 Integrability Theorem Let f ∈ M[a, b]. Then f ∈ R*[a, b] if and only if there 
exist functions α, ω ∈ R*[a, b] such that 

(16)    α(x) ≤ f(x) ≤ ω(x)    for almost every x ∈ [a, b]. 

Moreover, if either α or ω belongs to L[a, b], then f ∈ L[a, b]. 

Proof. (⇒) This implication is trivial, since one can take α = ω = f. 
(⇐) Since f ∈ M[a, b], there exists a sequence (s_k) of step functions on [a, b] such that 
(13) holds. We define s̄_k := mid{α, s_k, ω} for k ∈ N, so that s̄_k(x) is the middle of the 
numbers α(x), s_k(x), and ω(x) for each x ∈ [a, b]. It follows from Theorem 10.2.8 and the 
facts 

    mid{a, b, c} = min{max{a, b}, max{b, c}, max{c, a}}, 
    min{a′, b′, c′} = min{min{a′, b′}, c′}, 

that s̄_k ∈ R*[a, b] and that α ≤ s̄_k ≤ ω. Since f = lim s_k = lim s̄_k a.e., the Dominated 
Convergence Theorem now implies that f ∈ R*[a, b]. 

If either α or ω belongs to L[a, b], then we can apply Theorem 10.2.6 to conclude that f 
belongs to L[a, b]. Q.E.D. 
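The two identities for mid and min used in the proof are easy to confirm by brute force. 
The following sketch compares them with sorting on a small grid of values; the function 
names mid and min3 and the grid are choices made here for illustration. 

```python
import itertools


def mid(a, b, c):
    """Middle of three numbers: min{max{a, b}, max{b, c}, max{c, a}}."""
    return min(max(a, b), max(b, c), max(c, a))


def min3(a, b, c):
    """Minimum of three numbers built from two-argument minima: min{min{a, b}, c}."""
    return min(min(a, b), c)


if __name__ == "__main__":
    grid = [-2, -1, 0, 1, 2]
    for a, b, c in itertools.product(grid, repeat=3):
        ordered = sorted((a, b, c))
        assert mid(a, b, c) == ordered[1]
        assert min3(a, b, c) == ordered[0]
    # With alpha(x) <= omega(x), mid clamps the value s_k(x) into [alpha(x), omega(x)].
    print(mid(-1.0, 5.0, 2.0), mid(-1.0, -3.0, 2.0), mid(-1.0, 0.5, 2.0))
```

The last line shows the truncation used in the proof: a value of s_k(x) above ω(x) is 
replaced by ω(x), a value below α(x) by α(x), and a value in between is left alone. 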


A Final Word 


In this chapter we have made frequent reference to Lebesgue integrable functions on an interval 
I, which we have introduced as functions in R*(I) whose absolute value also belongs to R*(I). 
While there is no single “standard approach” to the Lebesgue integral, our approach is very 
different from any that are customary. A critic might say that our approach is not useful because 
our definition of a function in L(I) is not standard, but that would be wrong. 

After all, one seldom uses the definition to confirm that a specific function is Lebesgue 
integrable. Instead, one uses the fact that certain simpler functions (such as step functions, 
polynomials, continuous functions, bounded measurable functions) belong to L(I), and 
that more complicated functions belong to L(I) by taking algebraic combinations or 
various limiting operations (e.g., Hake’s Theorem or the Dominated Convergence Theorem). 
A famous analyst once said, “No one ever calculates a Lebesgue integral; instead, 
one calculates Riemann integrals and takes limits.” 

It is the same as with real numbers: we listed certain properties as axioms for R and 
then derived consequences of these properties that enable us to work quite effectively with 
the real numbers, often by taking limits. 


Exercises for Section 10.4 


1. Consider the following sequences of functions with the indicated domains. Does the sequence 
converge? If so, to what? Is the convergence uniform? Is it bounded? If not bounded, is it 
dominated? Is it monotone? Evaluate the limit of the sequence of integrals. 

(a) kx/(1 + kx)   [0, 1],        (b) x^k/(1 + x^k)   [0, 2], 
(c) 1/(1 + x^k)   [0, 1],        (d) 1/(1 + x^k)   [0, 2]. 

2. Answer the questions posed in Exercise 1 for the following sequences (when properly defined): 
kx 1 
—— [0,1], b) —— A 
© TEE Ol © Tape OM 
© -p I © s b] 
c i —— yj. 
vx +x) 7 vx(2 = x*) 
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10. 


11. 


12. 


13. 


14. 
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Discuss the following sequences of functions and their integrals on [0, 1]. Evaluate the limit of 
the integrals, when possible. 


a mae b) e™/x, 
(c) kxe ™, (d) kxe ™, 
(e) kxe? , (f) kxe 

1 xkdx l kxřdx i 
(a) Show that lim — 7 = 0. (b) Show that jim => 

k= Jo (1 +x) k>œ Jo L+x 

If f(x) := k for x € [1/k, 2/k] and f(x) := 0 elsewhere on [0, 2], show that f(x) — 0 but that 
J : f=. 


Let (fy) be a sequence on [a, b] such that each fy is differentiable on [a, b] and f(x) > g(x) 
with | if i (x) | < K for all x € [a, b]. Show that the sequence (f;.(x)) either converges for all x € 
[a, b] or it diverges for all x € [a, b]. 


If fx are the functions in Example 10.4.6(a), show that sup{ fy} does not belong to R*[0, co). 


Show directly that fọ e~dx = 1/t and fọ xe~“dx = 1/( for t > 0, thus confirming the 
results in Examples 10.4.6(d, e) when f(x) := 1. 


Use the differentiation formula in 10.4.6(f) to obtain fọ e~® sinx dx = 1/(? + 1). 


If t > 0, define E(t) := fọ [(e7®sin x) /x]dx. 
(a) Show that E We ane is continuous for t > a > 0. Moreover, E(t) — 0 as t > co. 


ð fe ™sinx 
Ot x 


(c) Deduce that E(t) = 4x — Arctan t for t > 0. 
(d) Explain why we cannot use the formula in (c) to obtain equation (12). 


(b) Since < e™ for t > a > 0, show that E’(t) = for t > 0. 


—1 
P+1 


11. In this exercise we will establish the important formula:

(17)    $\int_0^\infty e^{-x^2}\,dx = \tfrac{1}{2}\sqrt{\pi}$.

(a) Let $G(t) := \int_0^1 [e^{-t^2(x^2+1)}/(x^2 + 1)]\,dx$ for $t \ge 0$. Since the integrand is dominated by
$1/(x^2 + 1)$ for $t \ge 0$, then $G$ is continuous on $[0, \infty)$. Moreover, $G(0) = \operatorname{Arctan} 1 = \tfrac{1}{4}\pi$ and
it follows from the Dominated Convergence Theorem that $G(t) \to 0$ as $t \to \infty$.

(b) The partial derivative of the integrand with respect to $t$ is bounded for $t \ge 0$, $x \in [0, 1]$, so
$G'(t) = -2te^{-t^2} \int_0^1 e^{-t^2x^2}\,dx = -2e^{-t^2} \int_0^t e^{-u^2}\,du$.

(c) If we set $F(t) := \left[ \int_0^t e^{-x^2}\,dx \right]^2$, then the Fundamental Theorem 10.1.11 yields
$F'(t) = 2e^{-t^2} \int_0^t e^{-x^2}\,dx$ for $t \ge 0$, from which $F'(t) + G'(t) = 0$ for all $t \ge 0$. Therefore, $F(t) + G(t)
= C$ for all $t \ge 0$.

(d) Using $F(0) = 0$, $G(0) = \tfrac{1}{4}\pi$ and $\lim_{t\to\infty} G(t) = 0$, we conclude that $\lim_{t\to\infty} F(t) = \tfrac{1}{4}\pi$, so
that formula (17) holds.


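12. Suppose $I \subseteq \mathbb{R}$ is a closed interval and that $f : [a, b] \times I \to \mathbb{R}$ is such that $\partial f/\partial t$ exists on $[a, b] \times
I$, and for each $t \in [a, b]$ the function $x \mapsto f(t, x)$ is in $\mathcal{R}^*(I)$ and there exist $\alpha, \omega \in \mathcal{R}^*(I)$ such
that the partial derivative satisfies $\alpha(x) \le \partial f(t, x)/\partial t \le \omega(x)$ for a.e. $x \in I$. If
$F(t) := \int_I f(t, x)\,dx$, show that $F$ is differentiable on $[a, b]$ and that $F'(t) = \int_I \partial f(t, x)/\partial t\,dx$.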

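13. (a) If $f, g \in \mathcal{M}[a, b]$, show that $\max\{f, g\}$ and $\min\{f, g\}$ belong to $\mathcal{M}[a, b]$.

(b) If $f, g, h \in \mathcal{M}[a, b]$, show that $\operatorname{mid}\{f, g, h\} \in \mathcal{M}[a, b]$.

14. (a) If $(f_k)$ is a bounded sequence in $\mathcal{M}[a, b]$ and $f_k \to f$ a.e., show that $f \in \mathcal{M}[a, b]$. [Hint: Use
the Dominated Convergence Theorem.]

(b) If $(g_k)$ is any sequence in $\mathcal{M}[a, b]$ and if $f_k := \operatorname{Arctan} \circ g_k$, show that $(f_k)$ is a bounded
sequence in $\mathcal{M}[a, b]$.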

(c) If $(g_k)$ is a sequence in $\mathcal{M}[a, b]$ and if $g_k \to g$ a.e., show that $g \in \mathcal{M}[a, b]$.

15. A set $E$ in $[a, b]$ is said to be (Lebesgue) measurable if its characteristic function $\mathbf{1}_E$ (defined by
$\mathbf{1}_E(x) := 1$ if $x \in E$ and $\mathbf{1}_E(x) := 0$ if $x \in [a, b] \setminus E$) belongs to $\mathcal{M}[a, b]$. We will denote the
collection of measurable sets in $[a, b]$ by $\mathbb{M}[a, b]$. In this exercise, we develop a number of
properties of $\mathbb{M}[a, b]$.

(a) Show that $E \in \mathbb{M}[a, b]$ if and only if $\mathbf{1}_E$ belongs to $\mathcal{R}^*[a, b]$.

(b) Show that $\emptyset \in \mathbb{M}[a, b]$ and that if $[c, d] \subseteq [a, b]$, then the intervals $[c, d]$, $[c, d)$, $(c, d]$, and
$(c, d)$ are in $\mathbb{M}[a, b]$.

(c) Show that $E \in \mathbb{M}[a, b]$ if and only if $E' := [a, b] \setminus E$ is in $\mathbb{M}[a, b]$.

(d) If $E$ and $F$ are in $\mathbb{M}[a, b]$, then $E \cup F$, $E \cap F$ and $E \setminus F$ are also in $\mathbb{M}[a, b]$. [Hint: Show that
$\mathbf{1}_{E \cup F} = \max\{\mathbf{1}_E, \mathbf{1}_F\}$, etc.]

(e) If $(E_k)$ is an increasing sequence in $\mathbb{M}[a, b]$, show that $E := \bigcup_{k=1}^\infty E_k$ is in $\mathbb{M}[a, b]$. Also, if
$(F_k)$ is a decreasing sequence in $\mathbb{M}[a, b]$, show that $F := \bigcap_{k=1}^\infty F_k$ is in $\mathbb{M}[a, b]$. [Hint: Apply
Theorem 10.4.9(c).]

(f) If $(E_k)$ is any sequence in $\mathbb{M}[a, b]$, show that $\bigcup_{k=1}^\infty E_k$ and $\bigcap_{k=1}^\infty E_k$ are in $\mathbb{M}[a, b]$.


16. If $E \in \mathbb{M}[a, b]$, we define the (Lebesgue) measure of $E$ to be the number $m(E) := \int_a^b \mathbf{1}_E$. In this
exercise, we develop a number of properties of the measure function $m : \mathbb{M}[a, b] \to \mathbb{R}$.

(a) Show that $m(\emptyset) = 0$ and $0 \le m(E) \le b - a$.

(b) Show that $m([c, d]) = m([c, d)) = m((c, d]) = m((c, d)) = d - c$.

(c) Show that $m(E') = (b - a) - m(E)$.

(d) Show that $m(E \cup F) + m(E \cap F) = m(E) + m(F)$.

(e) If $E \cap F = \emptyset$, show that $m(E \cup F) = m(E) + m(F)$. (This is the additivity property of the
measure function.)

(f) If $(E_k)$ is an increasing sequence in $\mathbb{M}[a, b]$, show that $m\left( \bigcup_{k=1}^\infty E_k \right) = \lim_k m(E_k)$. [Hint: Use
the Monotone Convergence Theorem.]

(g) If $(C_k)$ is a sequence in $\mathbb{M}[a, b]$ that is pairwise disjoint (in the sense that $C_j \cap C_k = \emptyset$
whenever $j \ne k$), show that

(18)    $m\left( \bigcup_{k=1}^\infty C_k \right) = \sum_{k=1}^\infty m(C_k)$.

(This is the countable additivity property of the measure function.)


CHAPTER 11 


A GLIMPSE INTO TOPOLOGY 


For the most part, we have considered only functions that were defined on intervals. Indeed, 
for certain important results on continuous functions, the intervals were also assumed to be 
closed and bounded. We shall now examine functions defined on more general types of 
sets, with the goal of establishing certain important properties of continuous functions 
in a more general setting. For example, we proved in Section 5.3 that a function that is 
continuous on a closed and bounded interval attains a maximum value. However, we will 
see that the hypothesis that the set is an interval is not essential, and in the proper context it 
can be dropped. 

In Section 11.1 we define the notions of an open set and a closed set. The study of open 
sets and the concepts that can be defined in terms of open sets is the study of point-set 
topology, so we are in fact discussing certain aspects of the topology of R. (The 
mathematical area called “topology” is very abstract and goes far beyond the study of 
the real line, but the key ideas are to be found in real analysis. In fact, it is the study of 
continuous functions on R that motivated many of the concepts developed in topology.) 

The notion of compact set is defined in Section 11.2 in terms of open coverings. In 
advanced analysis, compactness is a powerful and widely used concept. The compact 
subsets of R are fully characterized by the Heine-Borel Theorem, so the full strength of the 
idea is not as apparent as it would be in more general settings. Nevertheless, as we establish 
the basic properties of continuous functions on compact sets in Section 11.3, the reader 
should begin to appreciate how compactness arguments are used. 

In Section 11.4 we take the essential features of distance on the real line and introduce 
a generalization of distance called a “metric.” The much-used triangle inequality is the key 
property in this general concept of distance. We present examples and show how theorems 
on the real line can be extended to the context of a metric space. 

The ideas in this chapter are somewhat more abstract than those in earlier chapters; 
however, abstraction can often lead to a deeper and more refined understanding. In this 
case, it leads to a more general setting for the study of analysis. 


Section 11.1 Open and Closed Sets in R 


There are special types of sets that play a distinguished role in analysis—these are the open 
and closed sets in R. To expedite the discussion, it is convenient to have an extended notion 
of a neighborhood of a point. 


11.1.1 Definition A neighborhood of a point $x \in \mathbb{R}$ is any set $V$ that contains an
$\varepsilon$-neighborhood $V_\varepsilon(x) := (x - \varepsilon, x + \varepsilon)$ of $x$ for some $\varepsilon > 0$.


While an $\varepsilon$-neighborhood of a point is required to be “symmetric about the point,” the
idea of a (general) neighborhood relaxes this particular feature, but often serves the same
purpose.


11.1.2 Definition (i) A subset $G$ of $\mathbb{R}$ is open in $\mathbb{R}$ if for each $x \in G$ there exists a
neighborhood $V$ of $x$ such that $V \subseteq G$.
(ii) A subset $F$ of $\mathbb{R}$ is closed in $\mathbb{R}$ if the complement $\mathcal{C}(F) := \mathbb{R} \setminus F$ is open in $\mathbb{R}$.


To show that a set $G \subseteq \mathbb{R}$ is open, it suffices to show that each point in $G$ has an
$\varepsilon$-neighborhood contained in $G$. In fact, $G$ is open if and only if for each $x \in G$, there exists
$\varepsilon_x > 0$ such that $(x - \varepsilon_x, x + \varepsilon_x)$ is contained in $G$.

To show that a set $F \subseteq \mathbb{R}$ is closed, it suffices to show that each point $y \notin F$ has an
$\varepsilon$-neighborhood disjoint from $F$. In fact, $F$ is closed if and only if for each $y \notin F$ there
exists $\varepsilon_y > 0$ such that $F \cap (y - \varepsilon_y, y + \varepsilon_y) = \emptyset$.


11.1.3 Examples (a) The entire set $\mathbb{R} = (-\infty, \infty)$ is open.

For any $x \in \mathbb{R}$, we may take $\varepsilon := 1$.

(b) The set $G := \{x \in \mathbb{R} : 0 < x < 1\}$ is open.

For any $x \in G$ we may take $\varepsilon_x$ to be the smaller of the numbers $x$, $1 - x$. We leave it
to the reader to show that if $|u - x| < \varepsilon_x$ then $u \in G$.

(c) Any open interval $I := (a, b)$ is an open set.

In fact, if $x \in I$, we can take $\varepsilon_x$ to be the smaller of the numbers $x - a$, $b - x$. The
reader can then show that $(x - \varepsilon_x, x + \varepsilon_x) \subseteq I$. Similarly, the intervals $(-\infty, b)$ and $(a, \infty)$
are open sets.

(d) The set $I := [0, 1]$ is not open.

This follows since every neighborhood of $0 \in I$ contains points not in $I$.

(e) The set $I := [0, 1]$ is closed.

To see this let $y \notin I$; then either $y < 0$ or $y > 1$. If $y < 0$, we take $\varepsilon_y := |y|$, and if $y > 1$
we take $\varepsilon_y := y - 1$. We leave it to the reader to show that in either case we have
$I \cap (y - \varepsilon_y, y + \varepsilon_y) = \emptyset$.

(f) The set $H := \{x : 0 \le x < 1\}$ is neither open nor closed. (Why?)

(g) The empty set $\emptyset$ is open in $\mathbb{R}$.

In fact, the empty set contains no points at all, so the requirement in Definition 11.1.2(i) is
vacuously satisfied. The empty set is also closed since its complement $\mathbb{R}$ is open, as was seen in
part (a).
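The choices of radius made in parts (c) and (e) can be checked mechanically. The following Python sketch is only an illustration: the function names are hypothetical (they do not come from the text), and exact rational arithmetic through `fractions.Fraction` is assumed so that no rounding enters.

```python
from fractions import Fraction

def radius_in_open_interval(x, a, b):
    """Return eps_x = min(x - a, b - x) for a point x of the open interval (a, b)."""
    if not a < x < b:
        raise ValueError("x must lie in (a, b)")
    return min(x - a, b - x)

def radius_outside_closed_interval(y, a, b):
    """Return eps_y > 0 such that (y - eps_y, y + eps_y) misses [a, b], for y outside [a, b]."""
    if a <= y <= b:
        raise ValueError("y must lie outside [a, b]")
    return a - y if y < a else y - b

# Part (c) with I = (0, 1) and x = 1/4: the neighborhood (0, 1/2) stays inside I.
x = Fraction(1, 4)
eps_x = radius_in_open_interval(x, 0, 1)

# Part (e) with I = [0, 1] and y = 3/2: the neighborhood (1, 2) is disjoint from I.
y = Fraction(3, 2)
eps_y = radius_outside_closed_interval(y, 0, 1)
```

With $a = 0$ and $b = 1$ the second function returns $|y|$ for $y < 0$ and $y - 1$ for $y > 1$, which are the radii chosen in part (e).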


In ordinary parlance, when applied to doors, windows, and minds, the words “open”
and “closed” are antonyms. However, when applied to subsets of $\mathbb{R}$, these words are not
antonyms. For example, we noted above that the sets $\emptyset$, $\mathbb{R}$ are both open and closed in $\mathbb{R}$.
(The reader will probably be relieved to learn that there are no other subsets of $\mathbb{R}$ that have
both properties.) In addition, there are many subsets of $\mathbb{R}$ that are neither open nor closed;
in fact, most subsets of $\mathbb{R}$ have this neutral character.


The following basic result describes the manner in which open sets relate to the 
operations of the union and intersection of sets in R. 


11.1.4 Open Set Properties (a) The union of an arbitrary collection of open subsets 
in R is open. 
(b) The intersection of any finite collection of open sets in R is open. 


Proof. (a) Let $\{G_\lambda : \lambda \in \Lambda\}$ be a family of sets in $\mathbb{R}$ that are open, and let $G$ be their
union. Consider an element $x \in G$; by the definition of union, $x$ must belong to $G_{\lambda_0}$ for
some $\lambda_0 \in \Lambda$. Since $G_{\lambda_0}$ is open, there exists a neighborhood $V$ of $x$ such that $V \subseteq G_{\lambda_0}$.
But $G_{\lambda_0} \subseteq G$, so that $V \subseteq G$. Since $x$ is an arbitrary element of $G$, we conclude that $G$ is
open in $\mathbb{R}$.

(b) Suppose $G_1$ and $G_2$ are open and let $G := G_1 \cap G_2$. To show that $G$ is open,
we consider any $x \in G$; then $x \in G_1$ and $x \in G_2$. Since $G_1$ is open, there exists $\varepsilon_1 > 0$ such
that $(x - \varepsilon_1, x + \varepsilon_1)$ is contained in $G_1$. Similarly, since $G_2$ is open, there exists $\varepsilon_2 > 0$
such that $(x - \varepsilon_2, x + \varepsilon_2)$ is contained in $G_2$. If we now take $\varepsilon$ to be the smaller of $\varepsilon_1$ and $\varepsilon_2$,
then the $\varepsilon$-neighborhood $U := (x - \varepsilon, x + \varepsilon)$ satisfies both $U \subseteq G_1$ and $U \subseteq G_2$. Thus,
$x \in U \subseteq G$. Since $x$ is an arbitrary element of $G$, we conclude that $G$ is open in $\mathbb{R}$.

It now follows by an Induction argument (which we leave to the reader to write out) 
that the intersection of any finite collection of open sets is open. Q.E.D. 


The corresponding properties for closed sets will be established by using the general
De Morgan identities for sets and their complements. (See Theorem 1.1.4.)


11.1.5 Closed Set Properties (a) The intersection of an arbitrary collection of closed 
sets in R is closed. 


(b) The union of any finite collection of closed sets in R is closed. 


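Proof. (a) If $\{F_\lambda : \lambda \in \Lambda\}$ is a family of closed sets in $\mathbb{R}$ and $F := \bigcap_{\lambda \in \Lambda} F_\lambda$, then
$\mathcal{C}(F) = \bigcup_{\lambda \in \Lambda} \mathcal{C}(F_\lambda)$ is the union of open sets. Hence, $\mathcal{C}(F)$ is open by Theorem 11.1.4(a), and
consequently, $F$ is closed.
(b) Suppose $F_1, F_2, \ldots, F_n$ are closed in $\mathbb{R}$ and let $F := F_1 \cup F_2 \cup \cdots \cup F_n$. By the
De Morgan identity the complement of $F$ is given by


$\mathcal{C}(F) = \mathcal{C}(F_1) \cap \cdots \cap \mathcal{C}(F_n)$.


Since each set $\mathcal{C}(F_i)$ is open, it follows from Theorem 11.1.4(b) that $\mathcal{C}(F)$ is open. Hence $F$
is closed. Q.E.D.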


The finiteness restrictions in 11.1.4(b) and 11.1.5(b) cannot be removed. Consider the 
following examples: 


11.1.6 Examples (a) Let $G_n := (0, 1 + 1/n)$ for $n \in \mathbb{N}$. Then $G_n$ is open for each
$n \in \mathbb{N}$, by Example 11.1.3(c). However, the intersection $G := \bigcap_{n=1}^\infty G_n$ is the interval $(0, 1]$,
which is not open. Thus, the intersection of infinitely many open sets in $\mathbb{R}$ need not be open.
(b) Let $F_n := [1/n, 1]$ for $n \in \mathbb{N}$. Each $F_n$ is closed, but the union $F := \bigcup_{n=1}^\infty F_n$ is the set
$(0, 1]$ which is not closed. Thus, the union of infinitely many closed sets in $\mathbb{R}$ need not be
closed.
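Part (a) can be seen through the proof of 11.1.4(b): for finitely many sets the radius is a minimum of finitely many positive numbers, but that minimum shrinks as more sets are added. The Python sketch below is illustrative only; the function name is hypothetical, and it treats the specific sets $G_n = (0, 1 + 1/n)$ of the example at the point $x = 1$.

```python
from fractions import Fraction

def radius_in_finite_intersection(x, N):
    """Largest eps with (x - eps, x + eps) inside G_1 ∩ ... ∩ G_N, where G_n = (0, 1 + 1/n)."""
    # For each G_n the admissible radius at x is min(x - 0, (1 + 1/n) - x).
    radii = [min(x, 1 + Fraction(1, n) - x) for n in range(1, N + 1)]
    return min(radii)

# At x = 1 the radius for the first N sets is 1/N: positive for each N, but with infimum 0.
radii_at_one = [radius_in_finite_intersection(Fraction(1), N) for N in (1, 10, 100, 1000)]
```

Each finite intersection contains a neighborhood of $1$, while no single $\varepsilon > 0$ works for all $n$ at once, which is why $1$ has no neighborhood inside $(0, 1]$.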


The Characterization of Closed Sets 


We shall now give a characterization of closed subsets of R in terms of sequences. As we 
shall see, closed sets are precisely those sets F that contain the limits of all convergent 
sequences whose elements are taken from F. 


11.1.7 Characterization of Closed Sets Let $F \subseteq \mathbb{R}$; then the following assertions are
equivalent.
(i) $F$ is a closed subset of $\mathbb{R}$.
(ii) If $X = (x_n)$ is any convergent sequence of elements in $F$, then $\lim X$ belongs to $F$.


Proof. (i) $\Rightarrow$ (ii) Let $X = (x_n)$ be a sequence of elements in $F$ and let $x := \lim X$; we
wish to show that $x \in F$. Suppose, on the contrary, that $x \notin F$; that is, that $x \in \mathcal{C}(F)$, the
complement of $F$. Since $\mathcal{C}(F)$ is open and $x \in \mathcal{C}(F)$, it follows that there exists an
$\varepsilon$-neighborhood $V_\varepsilon$ of $x$ such that $V_\varepsilon$ is contained in $\mathcal{C}(F)$. Since $x = \lim(x_n)$, it follows that
there exists a natural number $K = K(\varepsilon)$ such that $x_K \in V_\varepsilon$. Therefore we must have
$x_K \in \mathcal{C}(F)$; but this contradicts the assumption that $x_n \in F$ for all $n \in \mathbb{N}$. Therefore, we
conclude that $x \in F$.

(ii) $\Rightarrow$ (i) Suppose, on the contrary, that $F$ is not closed, so that $G := \mathcal{C}(F)$ is not open.
Then there exists a point $y_0 \in G$ such that for each $n \in \mathbb{N}$, there is a number $y_n \in \mathcal{C}(G) = F$
such that $|y_n - y_0| < 1/n$. It follows that $y_0 = \lim(y_n)$, and since $y_n \in F$ for all $n \in \mathbb{N}$, the
hypothesis (ii) implies that $y_0 \in F$, contrary to the assumption $y_0 \in G = \mathcal{C}(F)$. Thus the
hypothesis that $F$ is not closed implies that (ii) is not true. Consequently (ii) implies (i), as
asserted. Q.E.D.


The next result is closely related to the preceding theorem. It states that a set $F$ is
closed if and only if it contains all of its cluster points. Recall from Section 4.1 that a point $x$
is a cluster point of a set $F$ if every $\varepsilon$-neighborhood of $x$ contains a point of $F$ different from
$x$. Since by Theorem 4.1.2 each cluster point of a set $F$ is the limit of a sequence of points in
$F$, the result follows immediately from Theorem 11.1.7 above. We provide a second proof
that uses only the relevant definitions.


11.1.8 Theorem A subset of R is closed if and only if it contains all of its cluster points. 


Proof. Let $F$ be a closed set in $\mathbb{R}$ and let $x$ be a cluster point of $F$; we will show that $x \in F$.
If not, then $x$ belongs to the open set $\mathcal{C}(F)$. Therefore there exists an $\varepsilon$-neighborhood $V_\varepsilon$ of $x$
such that $V_\varepsilon \subseteq \mathcal{C}(F)$. Consequently $V_\varepsilon \cap F = \emptyset$, which contradicts the assumption that $x$ is
a cluster point of $F$.

Conversely, let $F$ be a subset of $\mathbb{R}$ that contains all of its cluster points; we will show
that $\mathcal{C}(F)$ is open. For if $y \in \mathcal{C}(F)$, then $y$ is not a cluster point of $F$. It follows that there
exists an $\varepsilon$-neighborhood $V_\varepsilon$ of $y$ that does not contain a point of $F$ (except possibly $y$). But
since $y \in \mathcal{C}(F)$, it follows that $V_\varepsilon \subseteq \mathcal{C}(F)$. Since $y$ is an arbitrary element of $\mathcal{C}(F)$, we
deduce that for every point in $\mathcal{C}(F)$ there is an $\varepsilon$-neighborhood that is entirely contained in
$\mathcal{C}(F)$. But this means that $\mathcal{C}(F)$ is open in $\mathbb{R}$. Therefore $F$ is closed in $\mathbb{R}$. Q.E.D.


The Characterization of Open Sets 


The idea of an open set in R is a generalization of the notion of an open interval. That this 
generalization does not lead to extremely exotic sets that are open is revealed by the next 
result. 


11.1.9 Theorem A subset of R is open if and only if it is the union of countably many 
disjoint open intervals in R. 


Proof. Suppose that $G \ne \emptyset$ is an open set in $\mathbb{R}$. For each $x \in G$, let $A_x := \{a \in \mathbb{R} :
(a, x] \subseteq G\}$ and let $B_x := \{b \in \mathbb{R} : [x, b) \subseteq G\}$. Since $G$ is open, it follows that $A_x$ and $B_x$
are not empty. (Why?) If the set $A_x$ is bounded below, we set $a_x := \inf A_x$; if $A_x$ is not
bounded below, we set $a_x := -\infty$. Note that in either case $a_x \notin G$. If the set $B_x$ is bounded
above, we set $b_x := \sup B_x$; if $B_x$ is not bounded above, we set $b_x := \infty$. Note that in either
case $b_x \notin G$.


We now define $I_x := (a_x, b_x)$; clearly $I_x$ is an open interval containing $x$. We claim
that $I_x \subseteq G$. To see this, let $y \in I_x$ and suppose that $y < x$. It follows from the definition of
$a_x$ that there exists $a' \in A_x$ with $a' < y$, whence $y \in (a', x] \subseteq G$. Similarly, if $y \in I_x$ and
$x < y$, there exists $b' \in B_x$ with $y < b'$, whence it follows that $y \in [x, b') \subseteq G$. Since $y \in I_x$
is arbitrary, we have that $I_x \subseteq G$.


Since $x \in G$ is arbitrary, we conclude that $\bigcup_{x \in G} I_x \subseteq G$. On the other hand, since for each
$x \in G$ there is an open interval $I_x$ with $x \in I_x \subseteq G$, we also have $G \subseteq \bigcup_{x \in G} I_x$. Therefore we
conclude that $G = \bigcup_{x \in G} I_x$.

We claim that if $x, y \in G$ and $x \ne y$, then either $I_x = I_y$ or $I_x \cap I_y = \emptyset$. To prove this
suppose that $z \in I_x \cap I_y$, whence it follows that $a_x < z < b_y$ and $a_y < z < b_x$. (Why?) We
will show that $a_x = a_y$. If not, it follows from the Trichotomy Property that either
(i) $a_x < a_y$, or (ii) $a_y < a_x$. In case (i), then $a_y \in I_x = (a_x, b_x) \subseteq G$, which contradicts
the fact that $a_y \notin G$. Similarly, in case (ii), then $a_x \in I_y = (a_y, b_y) \subseteq G$, which contradicts
the fact that $a_x \notin G$. Therefore we must have $a_x = a_y$ and a similar argument implies that
$b_x = b_y$. Therefore, we conclude that if $I_x \cap I_y \ne \emptyset$, then $I_x = I_y$.

It remains to show that the collection of distinct intervals $\{I_x : x \in G\}$ is countable. To
do this, we enumerate the set $\mathbb{Q}$ of rational numbers $\mathbb{Q} = \{r_1, r_2, \ldots, r_n, \ldots\}$ (see Theorem
1.3.11). It follows from the Density Theorem 2.4.8 that each interval $I_x$ contains rational
numbers; we select the rational number in $I_x$ that has the smallest index $n$ in this enumeration
of $\mathbb{Q}$. That is, we choose $r_{n(x)} \in \mathbb{Q}$ such that $I_{r_{n(x)}} = I_x$ and $n(x)$ is the smallest index $n$
such that $I_{r_n} = I_x$. Thus the set of distinct intervals $I_x$, $x \in G$, is put into correspondence with
a subset of $\mathbb{N}$. Hence this set of distinct intervals is countable. Q.E.D.


It is left as an exercise to show that the representation of G as a disjoint union of open 
intervals is uniquely determined. 
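Theorem 11.1.9 concerns arbitrary open sets, but when an open set happens to be given as a finite union of open intervals, its disjoint components can be computed by merging overlapping intervals. The following Python sketch assumes exactly that restricted setting; the function name `components` and the representation of $(a, b)$ as a pair are illustrative choices, not notation from the text.

```python
def components(intervals):
    """Write a finite union of open intervals (a, b) as a list of disjoint open intervals."""
    merged = []
    # Sort by left endpoint and skip empty intervals (those with a >= b).
    for a, b in sorted(iv for iv in intervals if iv[0] < iv[1]):
        if merged and a < merged[-1][1]:
            # (a, b) overlaps the current component, so extend that component.
            merged[-1] = (merged[-1][0], max(merged[-1][1], b))
        else:
            # No overlap: a new component starts here.
            merged.append((a, b))
    return merged

example = components([(5, 6), (0, 1), (0.5, 2), (2, 3)])
```

The strict inequality matters: $(0, 2)$ and $(2, 3)$ are kept apart because the point $2$ belongs to neither interval, so their union is not an interval.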


It does not follow from the preceding theorem that a subset of R is closed if and only if 
it is the intersection of a countable collection of closed intervals (why not?). In fact, there 
are closed sets in R that cannot be expressed as the intersection of a countable collection of 
closed intervals in R. A set consisting of two points is one example. (Why?) We will now 
describe the construction of a much more interesting example called the Cantor set. 


The Cantor Set 


The Cantor set, which we will denote by F, is a very interesting example of a (somewhat 
complicated) set that is unlike any set we have seen up to this point. It reveals how 
inadequate our intuition can sometimes be in trying to picture subsets of R. 

The Cantor set $F$ can be described by removing a sequence of open intervals from the
closed unit interval $I := [0, 1]$. We first remove the open middle third $(\tfrac{1}{3}, \tfrac{2}{3})$ of $[0, 1]$ to
obtain the set

$F_1 := [0, \tfrac{1}{3}] \cup [\tfrac{2}{3}, 1]$.

We next remove the open middle third of each of the two closed intervals in $F_1$ to obtain
the set

$F_2 := [0, \tfrac{1}{9}] \cup [\tfrac{2}{9}, \tfrac{1}{3}] \cup [\tfrac{2}{3}, \tfrac{7}{9}] \cup [\tfrac{8}{9}, 1]$.

We see that $F_2$ is the union of $2^2 = 4$ closed intervals, each of which is of the form
$[k/3^2, (k + 1)/3^2]$. We next remove the open middle thirds of each of these sets to get $F_3$,
which is the union of $2^3 = 8$ closed intervals. We continue in this way. In general, if $F_n$ has
been constructed and consists of the union of $2^n$ intervals of the form $[k/3^n, (k + 1)/3^n]$,
then we obtain the set $F_{n+1}$ by removing the open middle third of each of these intervals.
The Cantor set $F$ is what remains after this process has been carried out for every $n \in \mathbb{N}$.
(See Figure 11.1.1.) 


Figure 11.1.1 Construction of the Cantor set


11.1.10 Definition The Cantor set $F$ is the intersection of the sets $F_n$, $n \in \mathbb{N}$, obtained
by successive removal of open middle thirds, starting with $[0, 1]$.
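The successive removal of middle thirds is easy to carry out exactly for the first few stages. The sketch below is an illustration under stated assumptions: each closed interval is represented as a pair of `Fraction` endpoints, and the function names `remove_middle_thirds` and `cantor_stage` are hypothetical.

```python
from fractions import Fraction

def remove_middle_thirds(intervals):
    """Replace each closed interval [a, b] by its outer thirds [a, a + h] and [b - h, b], h = (b - a)/3."""
    result = []
    for a, b in intervals:
        h = (b - a) / 3
        result.append((a, a + h))
        result.append((b - h, b))
    return result

def cantor_stage(n):
    """Return the closed intervals whose union is F_n (with the convention F_0 = [0, 1])."""
    stage = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        stage = remove_middle_thirds(stage)
    return stage

F2 = cantor_stage(2)  # the four intervals [0, 1/9], [2/9, 1/3], [2/3, 7/9], [8/9, 1]
```

Only finitely many stages can be computed this way; the set $F$ itself is the intersection over all $n$ and is not produced by any single call.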


Since it is the intersection of closed sets, F is itself a closed set by 11.1.5(a). We now 
list some of the properties of F that make it such an interesting set. 


(1) The total length of the removed intervals is 1.

We note that the first middle third has length $1/3$, the next two middle thirds have
lengths that add up to $2/3^2$, the next four middle thirds have lengths that add up to $2^2/3^3$,
and so on. The total length $L$ of the removed intervals is given by

$L = \dfrac{1}{3} + \dfrac{2}{3^2} + \cdots + \dfrac{2^n}{3^{n+1}} + \cdots = \dfrac{1}{3} \sum_{n=0}^\infty \left( \dfrac{2}{3} \right)^n$.

Using the formula for the sum of a geometric series, we obtain

$L = \dfrac{1}{3} \cdot \dfrac{1}{1 - (2/3)} = 1$.

Thus $F$ is a subset of the unit interval $[0, 1]$ whose complement in $[0, 1]$ has total length 1.

Note also that the total length of the intervals that make up $F_n$ is $(2/3)^n$, which has limit
$0$ as $n \to \infty$. Since $F \subseteq F_n$ for all $n \in \mathbb{N}$, we see that if $F$ can be said to have “length,” it
must have length 0.

(2) The set $F$ contains no nonempty open interval as a subset.

Indeed, if $F$ contains a nonempty open interval $J := (a, b)$, then since $J \subseteq F_n$ for all
$n \in \mathbb{N}$, we must have $0 < b - a \le (2/3)^n$ for all $n \in \mathbb{N}$. Therefore $b - a = 0$, whence $J$ is
empty, a contradiction.

(3) The Cantor set $F$ has infinitely (even uncountably) many points.

The Cantor set contains all of the endpoints of the removed open intervals; among these
are the points $1/3^n$ and $2/3^n$ for each $n \in \mathbb{N}$. There are infinitely
many points of this form.

The Cantor set actually contains many more points than the endpoints of the removed intervals;
in fact, $F$ is an uncountable set. We give an outline of the argument. We note that each
$x \in [0, 1]$ can be written in a ternary (base 3) expansion

$x = \sum_{n=1}^\infty \dfrac{a_n}{3^n} = (.a_1 a_2 \cdots a_n \cdots)_3$

where each $a_n$ is either 0 or 1 or 2. (See the discussion at the end of Section 2.5.) Indeed,
each $x$ that lies in one of the removed open intervals has $a_n = 1$ for some $n$; for example,
each point in $(\tfrac{1}{3}, \tfrac{2}{3})$ has $a_1 = 1$. The endpoints of the removed intervals have two possible
ternary expansions, one having no 1s; for example, $\tfrac{1}{3} = (.100\cdots)_3 = (.022\cdots)_3$. If we
choose the expansion without 1s for these points, then $F$ consists of all $x \in [0, 1]$ that have
ternary expansions with no 1s; that is, $a_n$ is 0 or 2 for all $n \in \mathbb{N}$. We now define a mapping $\varphi$
of $F$ onto $[0, 1]$ as follows:

$\varphi\left( \sum_{n=1}^\infty \dfrac{a_n}{3^n} \right) := \sum_{n=1}^\infty \dfrac{a_n/2}{2^n}$ for $x \in F$.

That is, $\varphi((.a_1 a_2 \cdots)_3) = (.b_1 b_2 \cdots)_2$ where $b_n = a_n/2$ for all $n \in \mathbb{N}$ and $(.b_1 b_2 \cdots)_2$
denotes the binary representation of a number. Thus $\varphi$ is a surjection of $F$ onto $[0, 1]$.
Assuming that $F$ is countable, Theorem 1.3.10 implies that there exists a surjection $\psi$ of $\mathbb{N}$
onto $F$, so that $\varphi \circ \psi$ is a surjection of $\mathbb{N}$ onto $[0, 1]$. Another application of Theorem
1.3.10 implies that $[0, 1]$ is a countable set, which contradicts Theorem 2.5.5. Therefore $F$
is an uncountable set.
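For rational points the ternary description can be tested exactly. The Python sketch below is an illustration, not part of the text's argument: the function names are hypothetical, it assumes a rational input in $[0, 1]$, and it reads off ternary digits by repeatedly rescaling the outer third that contains the point (choosing the expansion without 1s at endpoints). Since the rescaled values have denominators dividing that of $x$, they must eventually repeat, so the membership test terminates.

```python
from fractions import Fraction

def in_cantor_set(x):
    """Exact membership test for a rational x in [0, 1]."""
    x = Fraction(x)
    seen = set()
    while x not in seen:
        seen.add(x)
        if x <= Fraction(1, 3):
            x = 3 * x            # ternary digit 0
        elif x >= Fraction(2, 3):
            x = 3 * x - 2        # ternary digit 2
        else:
            return False         # digit 1 is forced: x lies in a removed open interval
    return True                  # the digits cycle without ever using a 1

def phi_partial(x, digits):
    """Sum of the first `digits` binary terms (a_n / 2) / 2^n of phi(x), for x in the Cantor set."""
    x = Fraction(x)
    total = Fraction(0)
    for n in range(1, digits + 1):
        if x <= Fraction(1, 3):
            a, x = 0, 3 * x
        elif x >= Fraction(2, 3):
            a, x = 2, 3 * x - 2
        else:
            raise ValueError("x is not in the Cantor set")
        total += Fraction(a // 2, 2 ** n)
    return total
```

For instance, $\tfrac{1}{4} = (.0202\cdots)_3$ belongs to $F$ although it is not an endpoint of any removed interval, while $\tfrac{1}{2}$ does not belong to $F$. The partial sums for $\tfrac{1}{3}$ and $\tfrac{2}{3}$ both approach $\tfrac{1}{2}$, which reflects that $\varphi$ is onto but not one-to-one.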


Exercises for Section 11.1 


1. If $x \in (0, 1)$, let $\varepsilon_x$ be as in Example 11.1.3(b). Show that if $|u - x| < \varepsilon_x$, then $u \in (0, 1)$.

2. Show that the intervals $(a, \infty)$ and $(-\infty, a)$ are open sets, and that the intervals $[b, \infty)$ and
$(-\infty, b]$ are closed sets.


3. Write out the Induction argument in the proof of part (b) of the Open Set Properties 11.1.4.

4. Prove that $(0, 1] = \bigcap_{n=1}^\infty (0, 1 + 1/n)$, as asserted in Example 11.1.6(a).

5. Show that the set $\mathbb{N}$ of natural numbers is a closed set in $\mathbb{R}$.

6. Show that $A = \{1/n : n \in \mathbb{N}\}$ is not a closed set, but that $A \cup \{0\}$ is a closed set.

7. Show that the set $\mathbb{Q}$ of rational numbers is neither open nor closed.

8. Show that if $G$ is an open set and $F$ is a closed set, then $G \setminus F$ is an open set and $F \setminus G$ is a
closed set.


9. A point $x \in \mathbb{R}$ is said to be an interior point of $A \subseteq \mathbb{R}$ in case there is a neighborhood $V$ of $x$
such that $V \subseteq A$. Show that a set $A \subseteq \mathbb{R}$ is open if and only if every point of $A$ is an interior
point of $A$.

10. A point $x \in \mathbb{R}$ is said to be a boundary point of $A \subseteq \mathbb{R}$ in case every neighborhood $V$ of $x$
contains points in $A$ and points in $\mathcal{C}(A)$. Show that a set $A$ and its complement $\mathcal{C}(A)$ have exactly
the same boundary points.

11. Show that a set $G \subseteq \mathbb{R}$ is open if and only if it does not contain any of its boundary points.

12. Show that a set $F \subseteq \mathbb{R}$ is closed if and only if it contains all of its boundary points.

13. If $A \subseteq \mathbb{R}$, let $A^\circ$ be the union of all open sets that are contained in $A$; the set $A^\circ$ is called the
interior of $A$. Show that $A^\circ$ is an open set, that it is the largest open set contained in $A$, and that a
point $z$ belongs to $A^\circ$ if and only if $z$ is an interior point of $A$.

14. Using the notation of the preceding exercise, let $A$, $B$ be sets in $\mathbb{R}$. Show that $A^\circ \subseteq A$, $(A^\circ)^\circ =
A^\circ$, and that $(A \cap B)^\circ = A^\circ \cap B^\circ$. Show also that $A^\circ \cup B^\circ \subseteq (A \cup B)^\circ$, and give an example to
show that the inclusion may be proper.


15. If $A \subseteq \mathbb{R}$, let $A^-$ be the intersection of all closed sets containing $A$; the set $A^-$ is called the
closure of $A$. Show that $A^-$ is a closed set, that it is the smallest closed set containing $A$, and that
a point $w$ belongs to $A^-$ if and only if $w$ is either an interior point or a boundary point of $A$.

16. Using the notation of the preceding exercise, let $A$, $B$ be sets in $\mathbb{R}$. Show that we have $A \subseteq
A^-$, $(A^-)^- = A^-$, and that $(A \cup B)^- = A^- \cup B^-$. Show that $(A \cap B)^- \subseteq A^- \cap B^-$, and give an
example to show that the inclusion may be proper.

17. Give an example of a set $A \subseteq \mathbb{R}$ such that $A^\circ = \emptyset$ and $A^- = \mathbb{R}$.

18. Show that if $F \subseteq \mathbb{R}$ is a closed nonempty set that is bounded above, then $\sup F$ belongs to $F$.

19. If $G$ is open and $x \in G$, show that the sets $A_x$ and $B_x$ in the proof of Theorem 11.1.9 are not
empty.

20. If the set $A_x$ in the proof of Theorem 11.1.9 is bounded below, show that $a_x := \inf A_x$ does not
belong to $G$.

21. If in the notation used in the proof of Theorem 11.1.9, we have $a_x < y < x$, show that $y \in G$.

22. If in the notation used in the proof of Theorem 11.1.9, we have $I_x \cap I_y \ne \emptyset$, show that $b_x = b_y$.

23. Show that each point of the Cantor set $F$ is a cluster point of $F$.

24. Show that each point of the Cantor set $F$ is a cluster point of $\mathcal{C}(F)$.


Section 11.2 Compact Sets 


In advanced analysis and topology, the notion of a “compact” set is of enormous 
importance. This is less true in R because the Heine-Borel Theorem gives a very simple 
characterization of compact sets in R. Nevertheless, the definition and the techniques 
used in connection with compactness are very important, and the real line provides an 
appropriate place to see the idea of compactness for the first time. 

The definition of compactness uses the notion of an open cover, which we now define. 


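11.2.1 Definition Let $A$ be a subset of $\mathbb{R}$. An open cover of $A$ is a collection $\mathcal{G} = \{G_\alpha\}$ of
open sets in $\mathbb{R}$ whose union contains $A$; that is,


$A \subseteq \bigcup_\alpha G_\alpha$.


If $\mathcal{G}'$ is a subcollection of sets from $\mathcal{G}$ such that the union of the sets in $\mathcal{G}'$ also contains $A$,
then $\mathcal{G}'$ is called a subcover of $\mathcal{G}$. If $\mathcal{G}'$ consists of finitely many sets, then we call $\mathcal{G}'$ a finite
subcover of $\mathcal{G}$.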


There can be many different open covers for a given set. For example, if $A := [1, \infty)$,
then the reader can verify that the following collections of sets are all open covers of $A$:

$\mathcal{G}_0 := \{(0, \infty)\}$,
$\mathcal{G}_1 := \{(r - 1, r + 1) : r \in \mathbb{Q}, r > 0\}$,
$\mathcal{G}_2 := \{(n - 1, n + 1) : n \in \mathbb{N}\}$,
$\mathcal{G}_3 := \{(0, n) : n \in \mathbb{N}\}$,
$\mathcal{G}_4 := \{(0, n) : n \in \mathbb{N}, n \ge 23\}$.

We note that $\mathcal{G}_2$ is a subcover of $\mathcal{G}_1$, and that $\mathcal{G}_4$ is a subcover of $\mathcal{G}_3$. Of course, many other
open covers of $A$ can be described.


11.2.2 Definition A subset K of R is said to be compact if every open cover of K has a 
finite subcover. 


In other words, a set $K$ is compact if, whenever it is contained in the union of a
collection $\mathcal{G} = \{G_\alpha\}$ of open sets in $\mathbb{R}$, then it is contained in the union of some finite
number of sets in $\mathcal{G}$.


It is very important to note that, in order to apply the definition to prove that a set K is 
compact, we must examine an arbitrary collection of open sets whose union contains K, 
and show that K is contained in the union of some finite number of sets in the given 
collection. That is, it must be shown that any open cover of K has a finite subcover. On the 
other hand, to prove that a set H is not compact, it is sufficient to exhibit one specific 
collection G of open sets whose union contains H, but such that the union of any finite 
number of sets in G fails to contain H. That is, H is not compact if there exists some open 
cover of H that has no finite subcover. 


11.2.3 Examples (a) Let $K := \{x_1, x_2, \ldots, x_n\}$ be a finite subset of $\mathbb{R}$. If $\mathcal{G} = \{G_\alpha\}$ is
an open cover of $K$, then each $x_i$ is contained in some set $G_{\alpha_i}$ in $\mathcal{G}$. Then the union of the sets
in the collection $\{G_{\alpha_1}, G_{\alpha_2}, \ldots, G_{\alpha_n}\}$ contains $K$, so that it is a finite subcover of $\mathcal{G}$. Since $\mathcal{G}$
was arbitrary, it follows that the finite set $K$ is compact.

(b) Let $H := [0, \infty)$. To prove that $H$ is not compact, we will exhibit an open cover that
has no finite subcover. If we let $G_n := (-1, n)$ for each $n \in \mathbb{N}$, then $H \subseteq \bigcup_{n=1}^\infty G_n$, so that
$\mathcal{G} := \{G_n : n \in \mathbb{N}\}$ is an open cover of $H$. However, if $\{G_{n_1}, G_{n_2}, \ldots, G_{n_k}\}$ is any finite
subcollection of $\mathcal{G}$, and if we let $m := \sup\{n_1, n_2, \ldots, n_k\}$, then


$G_{n_1} \cup G_{n_2} \cup \cdots \cup G_{n_k} = G_m = (-1, m)$.


Evidently, this union fails to contain $H = [0, \infty)$. Thus no finite subcollection of $\mathcal{G}$ will
have its union contain $H$, and therefore $H$ is not compact.

(c) Let $J := (0, 1)$. If we let $G_n := (1/n, 1)$ for each $n \in \mathbb{N}$, then it is readily seen that
$J = \bigcup_{n=1}^\infty G_n$. Thus $\mathcal{G} := \{G_n : n \in \mathbb{N}\}$ is an open cover of $J$. If $\{G_{n_1}, G_{n_2}, \ldots, G_{n_r}\}$ is any
finite subcollection of $\mathcal{G}$, and if we set $s := \sup\{n_1, n_2, \ldots, n_r\}$ then


$G_{n_1} \cup G_{n_2} \cup \cdots \cup G_{n_r} = G_s = (1/s, 1)$.


Since $1/s$ is in $J$ but not in $G_s$, we see that the union does not contain $J$. Therefore, $J$ is
not compact.
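The witnesses used in parts (b) and (c) can be produced explicitly for any finite list of indices. The Python sketch below is illustrative, with hypothetical function names; for part (c) it assumes every index is at least 2, because $G_1 = (1, 1)$ is empty and the point $1/s$ lies in $J$ only when $s \ge 2$.

```python
from fractions import Fraction

def missed_point_of_H(indices):
    """Return a point of H = [0, oo) lying outside every G_n = (-1, n) with n in `indices`."""
    if not indices or min(indices) < 1:
        raise ValueError("expects a nonempty finite list of indices n >= 1")
    return max(indices)          # m = sup{n_1, ..., n_k} is in H but not in (-1, m)

def missed_point_of_J(indices):
    """Return a point of J = (0, 1) lying outside every G_n = (1/n, 1) with n in `indices`."""
    if not indices or min(indices) < 2:
        raise ValueError("expects a nonempty finite list of indices n >= 2")
    return Fraction(1, max(indices))   # 1/s is in J but not in (1/s, 1)

q = missed_point_of_H([3, 8])
p = missed_point_of_J([2, 5, 17])
```

However the finite list of indices is chosen, the function returns a point of the set that the chosen sets fail to cover, which is the content of the two non-compactness arguments.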


We now wish to describe all compact subsets of R. First we will establish by rather 
straightforward arguments that any compact set in R must be both closed and bounded. 
Then we will show that these properties in fact characterize the compact sets in R. This is 
the content of the Heine-Borel Theorem. 


11.2.4 Theorem If $K$ is a compact subset of $\mathbb{R}$, then $K$ is closed and bounded.

Proof. We will first show that $K$ is bounded. For each $m \in \mathbb{N}$, let $H_m := (-m, m)$. Since
each $H_m$ is open and since $K \subseteq \bigcup_{m=1}^\infty H_m = \mathbb{R}$, we see that the collection $\{H_m : m \in \mathbb{N}\}$ is an
open cover of $K$. Since $K$ is compact, this collection has a finite subcover, so there exists
$M \in \mathbb{N}$ such that


$K \subseteq \bigcup_{m=1}^M H_m = H_M = (-M, M)$.


Therefore $K$ is bounded, since it is contained in the bounded interval $(-M, M)$.
We now show that $K$ is closed, by showing that its complement $\mathcal{C}(K)$ is open. To do so,
let $u \in \mathcal{C}(K)$ be arbitrary and for each $n \in \mathbb{N}$, we let $G_n := \{y \in \mathbb{R} : |y - u| > 1/n\}$. It is an
exercise to show that each set $G_n$ is open and that $\mathbb{R} \setminus \{u\} = \bigcup_{n=1}^\infty G_n$. Since $u \notin K$, we have
$K \subseteq \bigcup_{n=1}^\infty G_n$. Since $K$ is compact, there exists $m \in \mathbb{N}$ such that


$K \subseteq \bigcup_{n=1}^m G_n = G_m$.


Now it follows from this that $K \cap (u - 1/m, u + 1/m) = \emptyset$, so that the interval $(u - 1/m,
u + 1/m) \subseteq \mathcal{C}(K)$. But since $u$ was an arbitrary point in $\mathcal{C}(K)$, we infer that $\mathcal{C}(K)$
is open. Q.E.D.


We now prove that the conditions of Theorem 11.2.4 are both necessary and sufficient 
for a subset of R to be compact. 


11.2.5 Heine-Borel Theorem A subset K of R is compact if and only if it is closed and 
bounded. 


Proof. We have shown in Theorem 11.2.4 that a compact set in $\mathbb{R}$ must be closed and
bounded. To establish the converse, suppose that $K$ is closed and bounded, and let $\mathcal{G} =
\{G_\alpha\}$ be an open cover of $K$. We wish to show that $K$ must be contained in the union of some
finite subcollection from $\mathcal{G}$. The proof will be by contradiction. We assume that:


(1)    $K$ is not contained in the union of any finite number of sets in $\mathcal{G}$.


By hypothesis, $K$ is bounded, so there exists $r > 0$ such that $K \subseteq [-r, r]$. We let $I_1 :=
[-r, r]$ and bisect $I_1$ into two closed subintervals $I_1' := [-r, 0]$ and $I_1'' := [0, r]$. At least one
of the two subsets $K \cap I_1'$ and $K \cap I_1''$ must be nonvoid and have the property that it is not
contained in the union of any finite number of sets in $\mathcal{G}$. [For if both of the sets $K \cap I_1'$ and
$K \cap I_1''$ are contained in the union of some finite number of sets in $\mathcal{G}$, then $K = (K \cap I_1') \cup
(K \cap I_1'')$ is contained in the union of some finite number of sets in $\mathcal{G}$, contrary to the
assumption (1).] If $K \cap I_1'$ is not contained in the union of any finite number of sets in $\mathcal{G}$,
we let $I_2 := I_1'$; otherwise $K \cap I_1''$ has this property and we let $I_2 := I_1''$.

We now bisect $I_2$ into two closed subintervals $I_2'$ and $I_2''$. If $K \cap I_2'$ is nonvoid and is not
contained in the union of any finite number of sets in $\mathcal{G}$, we let $I_3 := I_2'$; otherwise $K \cap I_2''$
has this property and we let $I_3 := I_2''$.

Continuing this process, we obtain a nested sequence of intervals $(I_n)$. By the Nested
Intervals Property 2.5.2, there is a point $z$ that belongs to all of the $I_n$, $n \in \mathbb{N}$. Since each
interval $I_n$ contains infinitely many points in $K$ (why?), the point $z$ is a cluster point of $K$.
Moreover, since $K$ is assumed to be closed, it follows from Theorem 11.1.8 that $z \in K$.
Therefore there exists a set $G_\lambda$ in $\mathcal{G}$ with $z \in G_\lambda$. Since $G_\lambda$ is open, there exists $\varepsilon > 0$ such
that


$(z - \varepsilon, z + \varepsilon) \subseteq G_\lambda$.


On the other hand, since the intervals $I_n$ are obtained by repeated bisections of $I_1 = [-r, r]$,
the length of $I_n$ is $r/2^{n-2}$. It follows that if $n$ is so large that $r/2^{n-2} < \varepsilon$, then
$I_n \subseteq (z - \varepsilon, z + \varepsilon) \subseteq G_\lambda$. But this means that if $n$ is such that $r/2^{n-2} < \varepsilon$, then $K \cap I_n$
is contained in the single set $G_\lambda$ in $\mathcal{G}$, contrary to our construction of $I_n$. This contradiction
shows that the assumption (1) that the closed bounded set $K$ requires an infinite number
of sets in $\mathcal{G}$ to cover it is untenable. We conclude that $K$ is compact. Q.E.D.


Remark It was seen in Example 11.2.3(b) that the closed set H := [0, ∞) is not
compact; note that H is not bounded. It was also seen in Example 11.2.3(c) that the 
bounded set J := (0, 1) is not compact; note that J is not closed. Thus, we cannot drop 
either hypothesis of the Heine-Borel Theorem. 
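
The failure of compactness for the bounded set J = (0, 1) can be made concrete. The following Python sketch assumes the open cover G_n := (1/n, 1), n ∈ N, of J (an illustrative choice, not necessarily the cover used in Example 11.2.3(c)), and the helper name `uncovered_point` is hypothetical. Given any finite list of indices, it returns a point of J that none of the chosen sets contains, which is why no finite subcover can exist.

```python
from fractions import Fraction

def uncovered_point(indices):
    """Return a point of (0, 1) lying in none of the sets G_n = (1/n, 1), n in indices.

    If m is the largest chosen index, then x = 1/(m + 1) satisfies x <= 1/n
    for every chosen n, so x is not in any of the chosen G_n.
    """
    m = max(indices)
    return Fraction(1, m + 1)

# A finite subfamily G_2, G_5, G_17 misses the point 1/18 of (0, 1).
chosen = [2, 5, 17]
x = uncovered_point(chosen)
print(x, any(Fraction(1, n) < x < 1 for n in chosen))
```

Exact rational arithmetic is used so that the comparison 1/n < x is not affected by rounding.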


We can combine the Heine-Borel Theorem with the Bolzano-Weierstrass Theorem 
3.4.8 to obtain a sequential characterization of the compact subsets of R. 


11.2.6 Theorem A subset K of R is compact if and only if every sequence in K has a 
subsequence that converges to a point in K. 


Proof. Suppose that K is compact and let (x_n) be a sequence with x_n ∈ K for all n ∈ N. By
the Heine-Borel Theorem, the set K is bounded so that the sequence (x_n) is bounded; by the
Bolzano-Weierstrass Theorem 3.4.8, there exists a subsequence (x_{n_k}) that converges. Since
K is closed (by Theorem 11.2.4), the limit x := lim(x_{n_k}) is in K. Thus every sequence in K
has a subsequence that converges to a point of K.

To establish the converse, we will show that if K is either not closed or not bounded,
then there must exist a sequence in K that has no subsequence converging to a point of K.
First, if K is not closed, then there is a cluster point c of K that does not belong to K. Since c
is a cluster point of K, there is a sequence (x_n) with x_n ∈ K and x_n ≠ c for all n ∈ N such
that lim(x_n) = c. Then every subsequence of (x_n) also converges to c, and since c ∉ K,
there is no subsequence that converges to a point of K.

Second, if K is not bounded, then there exists a sequence (x_n) in K such that |x_n| > n
for all n ∈ N. (Why?) Then every subsequence of (x_n) is unbounded, so that no subsequence
of it can converge to a point of K. Q.E.D.


Remark The reader has probably noticed that there is a similarity between the compactness
of the interval [a, b] and the existence of δ-fine partitions for [a, b]. In fact, these
properties are equivalent, each being deducible from the other. However, compactness 
applies to sets that are more general than intervals. 
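
The link between compactness and δ-fine partitions can be explored numerically. The sketch below is an illustration under stated assumptions, not the construction used in this book: it bisects [a, b] repeatedly and tags each piece by its midpoint t, accepting a piece [lo, hi] once it lies in [t − δ(t), t + δ(t)]. The function name `delta_fine_partition`, the midpoint tagging, and the `max_depth` guard are all hypothetical choices; with midpoint tags the search can fail for some gauges δ, which is why the guard is there.

```python
from typing import Callable, List, Tuple

def delta_fine_partition(a: float, b: float,
                         delta: Callable[[float], float],
                         max_depth: int = 40) -> List[Tuple[float, float, float]]:
    """Return tagged pieces (lo, hi, tag) of [a, b] with [lo, hi] inside
    [tag - delta(tag), tag + delta(tag)], found by repeated bisection."""
    pieces = []
    stack = [(a, b, 0)]
    while stack:
        lo, hi, depth = stack.pop()
        tag = (lo + hi) / 2          # midpoint tag (an assumption of this sketch)
        if (hi - lo) / 2 <= delta(tag):
            pieces.append((lo, hi, tag))
        elif depth >= max_depth:
            raise RuntimeError("bisection did not succeed within max_depth")
        else:
            stack.append((tag, hi, depth + 1))
            stack.append((lo, tag, depth + 1))
    return sorted(pieces)

# Example gauge on [0, 1]; it is bounded below by 0.05, so bisection must stop.
parts = delta_fine_partition(0.0, 1.0, lambda t: 0.05 + 0.1 * t)
print(len(parts), parts[0], parts[-1])
```

The finitely many accepted pieces play the role of a finite subcover: each piece sits inside one interval of the family {(t − δ(t), t + δ(t))}.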


Exercises for Section 11.2 


1. Exhibit an open cover of the interval (1, 2] that has no finite subcover.

2. Exhibit an open cover of N that has no finite subcover.

3. Exhibit an open cover of the set {1/n : n ∈ N} that has no finite subcover.

4. Prove, using Definition 11.2.2, that if F is a closed subset of a compact set K in R, then F is
compact.

5. Prove, using Definition 11.2.2, that if K_1 and K_2 are compact sets in R, then their union K_1 ∪ K_2
is compact.

6. Use the Heine-Borel Theorem to prove the following version of the Bolzano-Weierstrass
Theorem: Every bounded infinite subset of R has a cluster point in R. (Note that if a set has no
cluster points, then it is closed by Theorem 11.1.8.)

7. Find an infinite collection {K_n : n ∈ N} of compact sets in R such that the union ⋃_{n=1}^{∞} K_n is not
compact.

8. Prove that the intersection of an arbitrary collection of compact sets in R is compact.

9. Let (K_n : n ∈ N) be a sequence of nonempty compact sets in R such that K_1 ⊇ K_2 ⊇ ··· ⊇
K_n ⊇ ···. Prove that there exists at least one point x ∈ R such that x ∈ K_n for all n ∈ N; that
is, the intersection ⋂_{n=1}^{∞} K_n is not empty.

10. Let K ≠ ∅ be a compact set in R. Show that inf K and sup K exist and belong to K.

11. Let K ≠ ∅ be compact in R and let c ∈ R. Prove that there exists a point a in K such that
|c − a| = inf{|c − x| : x ∈ K}.

12. Let K ≠ ∅ be compact in R and let c ∈ R. Prove that there exists a point b in K such that
|c − b| = sup{|c − x| : x ∈ K}.

13. Use the notion of compactness to give an alternative proof of Exercise 5.3.18.

14. If K_1 and K_2 are disjoint nonempty compact sets, show that there exist k_i ∈ K_i such that
0 < |k_1 − k_2| = inf{|x_1 − x_2| : x_i ∈ K_i}.

15. Give an example of disjoint closed sets F_1, F_2 such that 0 = inf{|x_1 − x_2| : x_i ∈ F_i}.


Section 11.3 Continuous Functions 


In this section we will examine the way in which the concept of continuity of functions 
can be related to the topological ideas of open sets and compact sets. Some of the 
fundamental properties of continuous functions on intervals presented in Section 5.3 
will be established in this context. Among other things, these new arguments will 
show that the concept of continuity and many of its important properties can be 
carried to a greater level of abstraction. This will be discussed briefly in the next 
section on metric spaces. 


Continuity 


In Section 5.1 we were concerned with continuity at a point, that is, with the “local” 
continuity of functions. We will now be mainly concerned with “global” continuity in the 
sense that we will assume that the functions are continuous on their entire domains. 

The continuity of a function f : A → R at a point c ∈ A was defined in Section 5.1.
Theorem 5.1.2 stated that f is continuous at c if and only if for every ε-neighborhood
V_ε(f(c)) of f(c) there exists a δ-neighborhood V_δ(c) of c such that if x ∈ V_δ(c) ∩ A, then
f(x) ∈ V_ε(f(c)). We wish to restate this condition for continuity at a point in terms of
general neighborhoods. (Recall from 11.1.1 that a neighborhood of a point c is any set U
that contains an ε-neighborhood of c for some ε > 0.)


11.3.1 Lemma A function f : A → R is continuous at the point c in A if and only if for
every neighborhood U of f(c), there exists a neighborhood V of c such that if x ∈ V ∩ A,
then f(x) ∈ U.


Proof. Suppose f satisfies the stated condition. Then given ε > 0, we let U = V_ε(f(c))
and then obtain a neighborhood V for which x ∈ V ∩ A implies f(x) ∈ U. If we choose
δ > 0 such that V_δ(c) ⊆ V, then x ∈ V_δ(c) ∩ A implies f(x) ∈ U; therefore f is continuous
at c according to Theorem 5.1.2.

Conversely, if f is continuous at c in the sense of Theorem 5.1.2, then since any
neighborhood U of f(c) contains an ε-neighborhood V_ε(f(c)), it follows that taking the
δ-neighborhood V = V_δ(c) of c of Theorem 5.1.2 satisfies the condition of the lemma. Q.E.D.


We note that the statement that x ∈ V ∩ A implies f(x) ∈ U is equivalent to the
statement that f(V ∩ A) ⊆ U; that is, that the direct image of V ∩ A is contained in U. Also
from the definition of inverse image, this is the same as V ∩ A ⊆ f^{-1}(U). (See Definition
1.1.7 for the definitions of direct and inverse images.) Using this observation, we now 
obtain a condition for a function to be continuous on its domain in terms of open sets. In 
more advanced courses in topology, part (b) of the next result is often taken as the definition 
of (global) continuity. 


11.3.2 Global Continuity Theorem Let A ⊆ R and let f : A → R be a function with
domain A. Then the following are equivalent:


(a) f is continuous at every point of A.
(b) For every open set G in R, there exists an open set H in R such that H ∩ A = f^{-1}(G).


Proof. (a) ⇒ (b). Assume that f is continuous at every point of A, and let G be a given
open set in R. If c belongs to f^{-1}(G), then f(c) ∈ G, and since G is open, G is a
neighborhood of f(c). Therefore, by the preceding lemma, it follows from the continuity of
f that there is an open set V(c) containing c such that x ∈ V(c) ∩ A implies that f(x) ∈ G;
that is, V(c) ∩ A is contained in the inverse image f^{-1}(G). Select V(c) for each c in
f^{-1}(G), and let H be the union of all these sets V(c). By the Open Set Properties 11.1.4,
the set H is open, and we have H ∩ A = f^{-1}(G). Hence (a) implies (b).

(b) ⇒ (a). Let c be any point of A, and let G be an open neighborhood of f(c). Then
condition (b) implies that there exists an open set H in R such that H ∩ A = f^{-1}(G). Since
f(c) ∈ G, it follows that c ∈ H, so H is a neighborhood of c. If x ∈ H ∩ A, then f(x) ∈ G,
and therefore f is continuous at c. Thus (b) implies (a). Q.E.D.


In the case that A = R, the preceding result simplifies to some extent. 


11.3.3 Corollary A function f : R → R is continuous if and only if f^{-1}(G) is open in
R whenever G is open.


It must be emphasized that the Global Continuity Theorem 11.3.2 does not say that if f 
is a continuous function, then the direct image f (G) of an open set is necessarily open. In 
general, a continuous function will not send open sets to open sets. For example, consider 
the continuous function f : R → R defined by

    f(x) := x^2 + 1 for x ∈ R.

If G is the open set G := (−1, 1), then the direct image under f is f(G) = [1, 2), which is
not open in R. See the exercises for additional examples.
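
A quick numerical check of this example may help. The Python sketch below samples points strictly inside G = (−1, 1) and evaluates f(x) = x^2 + 1 on them; the sampling grid and the variable names are illustrative assumptions. The smallest sampled value is attained at x = 0 and equals 1, so the image contains its left endpoint 1, while every sampled value stays strictly below 2.

```python
def f(x: float) -> float:
    """The function f(x) = x^2 + 1 from the example above."""
    return x * x + 1

# Sample points strictly inside the open interval G = (-1, 1).
samples = [-1 + 2 * k / 2000 for k in range(1, 2000)]
values = [f(x) for x in samples]

print(min(values))        # the value 1 is attained (at x = 0)
print(max(values) < 2)    # the value 2 is approached but not attained
```

Sampling cannot prove that f(G) = [1, 2), but it is consistent with the claim that 1 belongs to f(G) without f(G) containing any neighborhood of 1.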


Preservation of Compactness 


In Section 5.3 we proved that a continuous function takes a closed, bounded interval [a, b]
onto a closed, bounded interval [m, M], where m and M are the minimum and maximum
values of f on [a, b], respectively. By the Heine-Borel Theorem, these are compact subsets
of R, so that Theorem 5.3.9 is a special case of the following theorem.


11.3.4 Preservation of Compactness If K is a compact subset of R and if f : K → R is
continuous on K, then f(K) is compact.


Proof. Let 𝒢 = {G_λ} be an open cover of the set f(K). We must show that 𝒢 has a finite
subcover. Since f(K) ⊆ ⋃ G_λ, it follows that K ⊆ ⋃ f^{-1}(G_λ). By Theorem 11.3.2, for
each G_λ there is an open set H_λ such that H_λ ∩ K = f^{-1}(G_λ). Then the collection {H_λ} is
an open cover of the set K. Since K is compact, this open cover of K contains a finite
subcover {H_{λ_1}, H_{λ_2}, ..., H_{λ_n}}. Then we have

    ⋃_{i=1}^{n} f^{-1}(G_{λ_i}) = ⋃_{i=1}^{n} (H_{λ_i} ∩ K) ⊇ K.

From this it follows that ⋃_{i=1}^{n} G_{λ_i} ⊇ f(K). Hence we have found a finite subcover of 𝒢.
Since 𝒢 was an arbitrary open cover of f(K), we conclude that f(K) is compact. Q.E.D.


11.3.5 Some Applications We will now show how to apply the notion of compactness
(and the Heine-Borel Theorem) to obtain alternative proofs of some important results that
we have proved earlier by using the Bolzano-Weierstrass Theorem. In fact, these theorems
remain true if the intervals are replaced by arbitrary nonempty compact sets in R.


(1) The Boundedness Theorem 5.3.2 is an immediate consequence of Theorem 11.3.4 and
the Heine-Borel Theorem 11.2.5. Indeed, if K ⊆ R is compact and if f : K → R is
continuous on K, then f(K) is compact and hence bounded.

(2) The Maximum-Minimum Theorem 5.3.4 is also an easy consequence of Theorem
11.3.4 and the Heine-Borel Theorem. As before, we find that f(K) is compact and hence
bounded in R, so that s* := sup f(K) exists. If s* did not belong to f(K), then s* would be a
cluster point of f(K), since by the definition of the supremum every ε-neighborhood of s*
contains a point of f(K), necessarily different from s*. But f(K) is a closed set, by the
Heine-Borel Theorem, so Theorem 11.1.8 would then give s* ∈ f(K), a contradiction. We
conclude that s* ∈ f(K); that is, s* = f(x*) for some x* ∈ K.

(3) We can also give a proof of the Uniform Continuity Theorem 5.4.3 based on the notion
of compactness. To do so, let K ⊆ R be compact and let f : K → R be continuous on K.
Then given ε > 0 and u ∈ K, there is a number δ_u := δ(½ε, u) > 0 such that if x ∈ K and
|x − u| < δ_u, then |f(x) − f(u)| < ½ε. For each u ∈ K, let G_u := (u − ½δ_u, u + ½δ_u) so that
G_u is open; we consider the collection 𝒢 = {G_u : u ∈ K}. Since u ∈ G_u for u ∈ K, it is
trivial that K ⊆ ⋃_{u∈K} G_u. Since K is compact, there are a finite number of sets, say
G_{u_1}, ..., G_{u_M}, whose union contains K. We now define

    δ(ε) := ½ inf{δ_{u_1}, ..., δ_{u_M}},

so that δ(ε) > 0. Now if x, u ∈ K and |x − u| < δ(ε), then there exists some u_k with
k = 1, ..., M such that x ∈ G_{u_k}; therefore |x − u_k| < ½δ_{u_k}. Since we have δ(ε) ≤ ½δ_{u_k}, it
follows that

    |u − u_k| ≤ |u − x| + |x − u_k| < δ_{u_k}.

But since δ_{u_k} = δ(½ε, u_k) it follows that both

    |f(x) − f(u_k)| < ½ε and |f(u) − f(u_k)| < ½ε.

Therefore we have |f(x) − f(u)| < ε.

We have shown that if ε > 0, then there exists δ(ε) > 0 such that if x, u are any points
in K with |x − u| < δ(ε), then |f(x) − f(u)| < ε. Since ε > 0 is arbitrary, this shows that f
is uniformly continuous on K, as asserted.


We conclude this section by extending the Continuous Inverse Theorem 5.6.5 to 
functions whose domains are compact subsets of R, rather than intervals in R. 


11.3.6 Theorem If K is a compact subset of R and f : K → R is injective and
continuous, then f^{-1} is continuous on f(K).


Proof. Since K is compact, Theorem 11.3.4 implies that the image f(K) is compact.
Since f is injective by hypothesis, the inverse function f^{-1} is defined on f(K) to K. Let (y_n)
be any convergent sequence in f(K), and let y_0 = lim(y_n); note that y_0 ∈ f(K), since the
compact set f(K) is closed. To establish the continuity of f^{-1}, we will show that the
sequence (f^{-1}(y_n)) converges to f^{-1}(y_0).

Let x_n := f^{-1}(y_n) and, by way of contradiction, assume that (x_n) does not converge to
x_0 := f^{-1}(y_0). Then there exists an ε > 0 and a subsequence (x'_k) such that |x'_k − x_0| ≥ ε
for all k. Since K is compact, we conclude from Theorem 11.2.6 that there is a subsequence
(x''_r) of the sequence (x'_k) that converges to a point x* of K. Since |x* − x_0| ≥ ε, we have
x* ≠ x_0. Now since f is continuous, we have lim(f(x''_r)) = f(x*). Also, since the
subsequence (y''_r) of (y_n) that corresponds to the subsequence (x''_r) of (x_n) must converge
to the same limit as (y_n) does, we have

    lim(f(x''_r)) = lim(y''_r) = y_0 = f(x_0).

Therefore we conclude that f(x*) = f(x_0). However, since f is injective, this implies that
x* = x_0, which is a contradiction. Thus we conclude that f^{-1} takes convergent sequences
in f(K) to convergent sequences in K, and hence f^{-1} is continuous. Q.E.D.


Exercises for Section 11.3 


1. Let f : R → R be defined by f(x) = x^2 for x ∈ R.
(a) Show that the inverse image f^{-1}(I) of an open interval I := (a, b) is either an open interval,
the union of two open intervals, or empty, depending on a and b.
(b) Show that if I is an open interval containing 0, then the direct image f(I) is not open.

2. Let f : R → R be defined by f(x) := 1/(1 + x^2) for x ∈ R.
(a) Find an open interval (a, b) whose direct image under f is not open.
(b) Show that the direct image of the closed interval [0, ∞) is not closed.

3. Let I := [1, ∞) and let f(x) := √(x − 1) for x ∈ I. For each ε-neighborhood G = (−ε, +ε) of 0,
exhibit an open set H such that H ∩ I = f^{-1}(G).

4. Let h : R → R be defined by h(x) := 1 if 0 ≤ x ≤ 1, h(x) := 0 otherwise. Find an open set G
such that h^{-1}(G) is not open, and a closed set F such that h^{-1}(F) is not closed.

5. Show that if f : R → R is continuous, then the set {x ∈ R : f(x) < α} is open in R for each
α ∈ R.

6. Show that if f : R → R is continuous, then the set {x ∈ R : f(x) ≤ α} is closed in R for each
α ∈ R.

7. Show that if f : R → R is continuous, then the set {x ∈ R : f(x) = k} is closed in R for each
k ∈ R.

8. Give an example of a function f : R → R such that the set {x ∈ R : f(x) = 1} is neither open
nor closed in R.

9. Prove that f : R → R is continuous if and only if for each closed set F in R, the inverse image
f^{-1}(F) is closed.

10. Let I := [a, b] and let f : I → R and g : I → R be continuous functions on I. Show that the
set {x ∈ I : f(x) = g(x)} is closed in R.


Section 11.4 Metric Spaces 


This book has been devoted to a careful study of the real number system and a number of 
different limiting processes that can be defined for functions of a real variable. A central 
topic was the study of continuous functions. At this point, with a strong understanding of 
analysis on the real line, the study of more general spaces and the related limit concepts can 
begin. It is possible to generalize the fundamental concepts of real analysis in several 
different ways, but one of the most fruitful is in the context of metric spaces, where a metric 
is an abstraction of a distance function. 

In this section, we will introduce the idea of metric space and then indicate how certain 
areas of the theory developed in this book can be extended to this new setting. We will 
discuss the concepts of neighborhood of a point, open and closed sets, convergence of 
sequences, and continuity of functions defined on metric spaces. Our purpose in this brief 
discussion is not to develop the theory of metric spaces to any great extent, but to reveal 
how the key ideas and techniques of real analysis can be put into a more abstract and 
general framework. The reader should note how the basic results of analysis on the real line 
serve to motivate and guide the study of analysis in more general contexts. 

Generalization can serve two important purposes. One purpose is that theorems 
derived in general settings can often be applied in many particular cases without the need of 
a separate proof for each special case. A second purpose is that by removing the 
nonessential (and sometimes distracting) features of special situations, it is often possible 
to understand the real significance of a concept or theorem. 


Metrics 


On the real line, basic limit concepts were defined in terms of the distance |x — y| 
between two points x, y in R, and many theorems were proved using the absolute value 
function. Actually, a careful study reveals that only a few key properties of the absolute 
value were required to prove many fundamental results, and it happens that these 
properties can be extracted and used to define more general distance functions called
“metrics.”


11.4.1 Definition A metric on a set S is a function d : S × S → R that satisfies the
following properties:

(a) d(x, y) ≥ 0 for all x, y ∈ S (positivity);

(b) d(x, y) = 0 if and only if x = y (definiteness);

(c) d(x, y) = d(y, x) for all x, y ∈ S (symmetry);

(d) d(x, y) ≤ d(x, z) + d(z, y) for all x, y, z ∈ S (triangle inequality).


A metric space (S, d) is a set S together with a metric d on S.
We consider several examples of metric spaces. 


11.4.2 Examples (a) The familiar metric on R is defined by

    d(x, y) := |x − y| for x, y ∈ R.

Property 11.4.1(d) for d follows from the Triangle Inequality for absolute value because
we have

    d(x, y) = |x − y| = |(x − z) + (z − y)|
            ≤ |x − z| + |z − y| = d(x, z) + d(z, y),

for all x, y, z ∈ R.


(b) The distance function in the plane obtained from the Pythagorean Theorem provides
one example of a metric in R^2. That is, we define the metric d on R^2 as follows: if
P_1 := (x_1, y_1) and P_2 := (x_2, y_2) are points in R^2, then

    d(P_1, P_2) := √((x_1 − x_2)^2 + (y_1 − y_2)^2).


(c) It is possible to define several different metrics on the same set. On R^2, we can also
define the metric d_1 as follows:

    d_1(P_1, P_2) := |x_1 − x_2| + |y_1 − y_2|.

Still another metric on R^2 is d_∞ defined by

    d_∞(P_1, P_2) := sup{|x_1 − x_2|, |y_1 − y_2|}.

The verifications that d_1 and d_∞ satisfy the properties of a metric are left as exercises.


(d) Let C[0, 1] denote the set of all continuous functions on the interval [0, 1] to R. For f, g
in C[0, 1], we define

    d_∞(f, g) := sup{|f(x) − g(x)| : x ∈ [0, 1]}.

Then it can be verified that d_∞ is a metric on C[0, 1]. This metric is the uniform norm of
f − g on [0, 1] as defined in Section 8.1; that is, d_∞(f, g) = ||f − g||, where ||f|| denotes
the uniform norm of f on the set [0, 1].


(e) We again consider C[0, 1], but we now define a different metric d_1 by

    d_1(f, g) := ∫_0^1 |f − g| for f, g ∈ C[0, 1].

The properties of the integral can be used to show that this is indeed a metric on C[0, 1]. The
details are left as an exercise.


(f) Let S be any nonempty set. For s, t ∈ S, we define

    d(s, t) := 0 if s = t,
               1 if s ≠ t.

It is an exercise to show that d is a metric on S. This metric is called the discrete metric
on the set S.
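
The metric axioms of Definition 11.4.1 can be spot-checked on a finite set of points. The Python sketch below is only an illustration and proves nothing about all of R^2: the function names (`d2`, `d1`, `d_inf`, `discrete`, `is_metric_on`) and the integer grid `GRID` are hypothetical choices. It tests positivity, definiteness, symmetry, and the triangle inequality for the three metrics on R^2 from Examples (b) and (c) and for the discrete metric from Example (f).

```python
import itertools
import math

def d2(p, q):
    """Euclidean metric of Example 11.4.2(b)."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def d1(p, q):
    """The metric d_1 of Example 11.4.2(c)."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def d_inf(p, q):
    """The metric d_infinity of Example 11.4.2(c)."""
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))

def discrete(s, t):
    """The discrete metric of Example 11.4.2(f)."""
    return 0 if s == t else 1

def is_metric_on(d, points, tol=1e-12):
    """Check properties (a)-(d) of Definition 11.4.1 on a finite set of points."""
    for x, y in itertools.product(points, repeat=2):
        if d(x, y) < 0:                      # (a) positivity
            return False
        if (d(x, y) == 0) != (x == y):       # (b) definiteness
            return False
        if abs(d(x, y) - d(y, x)) > tol:     # (c) symmetry
            return False
    for x, y, z in itertools.product(points, repeat=3):
        if d(x, y) > d(x, z) + d(z, y) + tol:  # (d) triangle inequality
            return False
    return True

GRID = [(i, j) for i in range(-2, 3) for j in range(-2, 3)]
for name, d in [("d2", d2), ("d1", d1), ("d_inf", d_inf), ("discrete", discrete)]:
    print(name, is_metric_on(d, GRID))
```

By contrast, the squared Euclidean distance fails the check, because it violates the triangle inequality already for three collinear points such as (0, 0), (1, 0), (2, 0).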


We note that if (S, d) is a metric space, and if T ⊆ S, then d' defined by d'(x, y) :=
d(x, y) for all x, y ∈ T gives a metric on T, which we generally denote by d. With this
understanding, we say that (T, d) is also a metric space. For example, the metric d on R 
defined by the absolute value is a metric on the set Q of rational numbers, and thus (Q, d) is 
also a metric space. 


Neighborhoods and Convergence 


The basic notion needed for the introduction of limit concepts is that of neighborhood, and 
this is defined in metric spaces as follows. 


11.4.3 Definition Let (S, d) be a metric space. Then for ε > 0, the ε-neighborhood of a
point x_0 in S is the set

    V_ε(x_0) := {x ∈ S : d(x_0, x) < ε}.

A neighborhood of x_0 is any set U that contains an ε-neighborhood of x_0 for some ε > 0.


Any notion defined in terms of neighborhoods can now be defined and discussed in the
context of metric spaces by modifying the language appropriately. We first consider the
convergence of sequences.

A sequence in a metric space (S, d) is a function X : N → S with domain N and range
in S, and the usual notations for sequence are used; we write X = (x_n), but now x_n ∈ S for
all n ∈ N. When we replace the absolute value by a metric in the definition of sequential
convergence, we get the notion of convergence in a metric space.


11.4.4 Definition Let (x_n) be a sequence in the metric space (S, d). The sequence (x_n) is
said to converge to x in S if for any ε > 0 there exists K ∈ N such that x_n ∈ V_ε(x) for all
n ≥ K.


Note that since x_n ∈ V_ε(x) if and only if d(x_n, x) < ε, a sequence (x_n) converges to x
if and only if for any ε > 0 there exists K such that d(x_n, x) < ε for all n ≥ K. In other
words, a sequence (x_n) in (S, d) converges to x if and only if the sequence of real numbers
(d(x_n, x)) converges to 0.
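
The reformulation in terms of the real sequence (d(x_n, x)) is easy to experiment with. The following Python sketch uses an assumed example in R^2 with the Euclidean metric: P_n := (1/n, 1 − 1/n) and P := (0, 1), for which d(P_n, P) = √2/n. The names `P_n` and `first_index_within` are hypothetical. Because the distances decrease with n in this example, the first index K with d(P_K, P) < ε works for all n ≥ K.

```python
import math

def d(p, q):
    """Euclidean metric on R^2."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

P = (0.0, 1.0)

def P_n(n):
    """Illustrative sequence converging to P = (0, 1)."""
    return (1.0 / n, 1.0 - 1.0 / n)

def first_index_within(eps, max_n=10**6):
    """Smallest K <= max_n with d(P_K, P) < eps, found by direct search."""
    for n in range(1, max_n + 1):
        if d(P_n(n), P) < eps:
            return n
    raise ValueError("no index found up to max_n")

for eps in (0.1, 0.01):
    print(eps, first_index_within(eps))
```

Here the search reflects the inequality √2/n < ε, that is, n > √2/ε.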


11.4.5 Examples (a) Consider R^2 with the metric d defined in Example 11.4.2(b). If
P_n = (x_n, y_n) ∈ R^2 for each n ∈ N, then we claim that the sequence (P_n) converges to
P = (x, y) with respect to this metric if and only if the sequences of real numbers (x_n) and
(y_n) converge to x and y, respectively.

First, we note that the inequality |x_n − x| ≤ d(P_n, P) implies that if (P_n) converges to
P with respect to the metric d, then the sequence (x_n) converges to x; the convergence of
(y_n) follows in a similar way. The converse follows from the inequality d(P_n, P) ≤
|x_n − x| + |y_n − y|, which is readily verified. The details are left to the reader.

(b) Let d_∞ be the metric on C[0, 1] defined in Example 11.4.2(d). Then a sequence (f_n)
in C[0, 1] converges to f with respect to this metric if and only if (f_n) converges to f
uniformly on the set [0, 1]. This is established in Lemma 8.1.8 in the discussion of the
uniform norm.
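
Convergence in C[0, 1] depends on which metric is used. The Python sketch below approximates the metrics d_∞ and d_1 of Examples 11.4.2(d, e) on a uniform grid; the helper names `d_sup` and `d_int`, the grid size, and the test functions are assumptions made for illustration, and a grid can only approximate a supremum or an integral. For g_n(x) := x^n the exact values are d_∞(g_n, 0) = 1 and d_1(g_n, 0) = 1/(n + 1), so (g_n) approaches the zero function in d_1 but not in d_∞.

```python
def d_sup(f, g, steps=10_000):
    """Grid approximation of d_infinity(f, g) = sup |f - g| on [0, 1]."""
    return max(abs(f(k / steps) - g(k / steps)) for k in range(steps + 1))

def d_int(f, g, steps=10_000):
    """Midpoint-rule approximation of d_1(f, g), the integral of |f - g| over [0, 1]."""
    h = 1.0 / steps
    return h * sum(abs(f((k + 0.5) * h) - g((k + 0.5) * h)) for k in range(steps))

def zero(x):
    return 0.0

def power(n):
    """Return the function g_n(x) = x^n."""
    return lambda x: x ** n

for n in (5, 50):
    print(n, d_sup(power(n), zero), d_int(power(n), zero))
```

For comparison, the functions x/n have d_∞ distance 1/n from the zero function, so they converge to it uniformly on [0, 1].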


Cauchy Sequences 


The notion of Cauchy sequence is a significant concept in metric spaces. The definition is 
formulated as expected, with the metric replacing the absolute value. 


11.4.6 Definition Let (S, d) be a metric space. A sequence (x_n) in S is said to be a
Cauchy sequence if for each ε > 0, there exists H ∈ N such that d(x_n, x_m) < ε for all
n, m ≥ H.


The Cauchy Convergence Theorem 3.5.5 for sequences in R states that a sequence in
R is a Cauchy sequence if and only if it converges to a point of R. This theorem is not true
for metric spaces in general, as the examples that follow will reveal. Those metric spaces 
for which Cauchy sequences are convergent have special importance. 


11.4.7 Definition A metric space (S, d) is said to be complete if each Cauchy sequence 
in S converges to a point of S. 


In Section 2.3 the Completeness Property of R is stated in terms of the order properties 
by requiring that every nonempty subset of R that is bounded above has a supremum in R. 
The convergence of Cauchy sequences is deduced as a theorem. In fact, it is possible to 
reverse the roles of these fundamental properties of R: the Completeness Property of R can 
be stated in terms of Cauchy sequences as in 11.4.7, and the Supremum Property can then 
be deduced as a theorem. Since many metric spaces do not have an appropriate order 
structure, a concept of completeness must be described in terms of the metric, and Cauchy 
sequences provide the natural vehicle for this. 


11.4.8 Examples (a) The metric space (Q, d) of rational numbers with the metric 
defined by the absolute value function is not complete. 

For example, if (x_n) is a sequence of rational numbers that converges to √2, then it is
Cauchy in Q, but it does not converge to a point of Q. Therefore (Q, d) is not a complete 
metric space. 
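
One concrete sequence of this kind can be generated with exact rational arithmetic. The Python sketch below uses the iteration x_{n+1} := (x_n + 2/x_n)/2 starting from x_0 := 2; this particular sequence and the name `newton_sqrt2` are illustrative assumptions, not taken from the text. Every term is rational, consecutive terms get very close together, and the squares approach 2 without ever equaling 2.

```python
from fractions import Fraction

def newton_sqrt2(steps):
    """Rational terms x_0, ..., x_steps of x_{n+1} = (x_n + 2/x_n)/2 with x_0 = 2."""
    x = Fraction(2)
    terms = [x]
    for _ in range(steps):
        x = (x + 2 / x) / 2
        terms.append(x)
    return terms

seq = newton_sqrt2(5)
for x in seq:
    # Each term is an exact fraction; its square differs from 2 by a shrinking amount.
    print(x, float(x * x - 2))
```

No term can have square exactly 2, since √2 is irrational, so a limit inside Q is impossible even though the terms cluster together.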

(b) The space C[0, 1] with the metric d_∞ defined in 11.4.2(d) is complete.

To prove this, suppose that (f_n) is a Cauchy sequence in C[0, 1] with respect to the
metric d_∞. Then, given ε > 0, there exists H such that

(1)    |f_n(x) − f_m(x)| < ε

for all x ∈ [0, 1] and all n, m ≥ H. Thus for each x, the sequence (f_n(x)) is Cauchy in R,
and therefore converges in R. We define f to be the pointwise limit of the sequence; that is,
f(x) := lim(f_n(x)) for each x ∈ [0, 1]. It follows from (1) that for each x ∈ [0, 1] and each
n ≥ H, we have |f_n(x) − f(x)| ≤ ε. Consequently the sequence (f_n) converges uniformly
to f on [0, 1]. Since the uniform limit of continuous functions is also continuous (by 8.2.2),
the function f is in C[0, 1]. Therefore the metric space (C[0, 1], d_∞) is complete.

(c) If d_1 is the metric on C[0, 1] defined in 11.4.2(e), then the metric space (C[0, 1], d_1) is
not complete.

To prove this statement, it suffices to exhibit a Cauchy sequence that does not have a
limit in the space. We define the sequence (f_n) for n ≥ 3 as follows (see Figure 11.4.1):

    f_n(x) := 1               for 0 ≤ x ≤ 1/2,
              1 + n/2 − nx    for 1/2 < x ≤ 1/2 + 1/n,
              0               for 1/2 + 1/n < x ≤ 1.

Note that the sequence (f_n) converges pointwise to the discontinuous function f(x) := 1 for
0 ≤ x ≤ 1/2 and f(x) := 0 for 1/2 < x ≤ 1. Hence f ∉ C[0, 1]; in fact, there is no
function g ∈ C[0, 1] such that d_1(f_n, g) → 0.


Figure 11.4.1 The sequence (f_n)
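
The behavior of the functions f_n of Example 11.4.8(c) under the two metrics on C[0, 1] can be estimated numerically. In the Python sketch below, the helper names `f`, `d_int`, and `d_sup` and the grid size are assumptions for illustration. A direct computation of areas gives d_1(f_n, f_m) = |1/(2n) − 1/(2m)|, which tends to 0, so the sequence is Cauchy with respect to d_1. In contrast, d_∞(f_n, f_{2n}) = 1/2 for every n, so the sequence is not Cauchy with respect to d_∞.

```python
def f(n, x):
    """The function f_n of Example 11.4.8(c), for n >= 3 and 0 <= x <= 1."""
    if x <= 0.5:
        return 1.0
    if x <= 0.5 + 1.0 / n:
        return 1.0 + n / 2 - n * x
    return 0.0

def d_int(n, m, steps=20_000):
    """Midpoint-rule approximation of d_1(f_n, f_m)."""
    h = 1.0 / steps
    return h * sum(abs(f(n, (k + 0.5) * h) - f(m, (k + 0.5) * h))
                   for k in range(steps))

def d_sup(n, m, steps=20_000):
    """Grid approximation of d_infinity(f_n, f_m)."""
    return max(abs(f(n, k / steps) - f(m, k / steps)) for k in range(steps + 1))

for n in (10, 100):
    # d_1 distances shrink like 1/(4n); sup distances stay near 1/2.
    print(n, d_int(n, 2 * n), d_sup(n, 2 * n))
```

This is consistent with part (b): a sequence that converged in (C[0, 1], d_∞) would have a continuous limit, whereas the pointwise limit here is discontinuous.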


Open Sets and Continuity 


With the notion of neighborhood defined, the definitions of open set and closed set read the 
same as for sets in R. 


11.4.9 Definition Let (S, d) be a metric space. A subset G of S is said to be an open set in
S if for every point x ∈ G there is a neighborhood U of x such that U ⊆ G. A subset F of S is
said to be a closed set in S if the complement S\F is an open set in S.


Theorems 11.1.4 and 11.1.5 concerning the unions and intersections of open sets and
closed sets can be extended to metric spaces without difficulty. In fact, the proofs of those
theorems carry over to metric spaces with very little change: simply replace the
ε-neighborhoods (x − ε, x + ε) in R by ε-neighborhoods V_ε(x) in S.

We now can examine the concept of continuity for functions that map one metric space
(S_1, d_1) into another metric space (S_2, d_2). Note that we modify the property in 5.1.2 of
continuity for functions on R by replacing neighborhoods in R by neighborhoods in the
metric spaces.


11.4.10 Definition Let (S_1, d_1) and (S_2, d_2) be metric spaces, and let f : S_1 → S_2 be a
function from S_1 to S_2. The function f is said to be continuous at the point c in S_1 if for
every ε-neighborhood V_ε(f(c)) of f(c) there exists a δ-neighborhood V_δ(c) of c such that if
x ∈ V_δ(c), then f(x) ∈ V_ε(f(c)).


The ε-δ formulation of continuity can be stated as follows: f : S_1 → S_2 is continuous at
c if and only if for each ε > 0 there exists δ > 0 such that d_1(x, c) < δ implies that

    d_2(f(x), f(c)) < ε.

The Global Continuity Theorem can be established for metric spaces by appropriately
modifying the argument for functions on R.


11.4.11 Global Continuity Theorem If (S_1, d_1) and (S_2, d_2) are metric spaces, then a
function f : S_1 → S_2 is continuous on S_1 if and only if f^{-1}(G) is open in S_1 whenever G is
open in S_2.


The notion of compactness extends immediately to metric spaces. A metric space 
(S, d) is said to be compact if each open cover of S has a finite subcover. Then by 
modifying the proof of 11.3.4, we obtain the following result. 


11.4.12 Preservation of Compactness If (S, d) is a compact metric space and if the
function f : S → R is continuous, then f(S) is compact in R.


The important properties of continuous functions given in 11.3.5 then follow immediately.
The Boundedness Theorem, the Maximum-Minimum Theorem, and the Uniform
Continuity Theorem for real-valued continuous functions on a compact metric space are all
established by appropriately modifying the language of the proofs given in 11.3.5.


Semimetrics 


11.4.13 Definition A semimetric on a set S is a function d : S × S → R that satisfies all
of the conditions in Definition 11.4.1, except that condition (b) is replaced by the weaker
condition


(b') d(x, y) = 0 if x = y.


A semimetric space (S, d) is a set S together with a semimetric d on S. 


Thus every metric is a semimetric, and every metric space is a semimetric space. 
However, the converse is not true. For example, if P_1 := (x_1, y_1) and P_2 := (x_2, y_2) are
points in the space R^2, the function d_1 defined by

    d_1(P_1, P_2) := |x_1 − x_2|,

is easily seen to be a semimetric, but it is not a metric since any two points with the same
first coordinate have “d_1-distance” equal to 0.
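
A short Python sketch makes the failure of definiteness visible. The function name `d_first` and the sample points are hypothetical; the function simply implements the semimetric |x_1 − x_2| above and exhibits two distinct points at distance 0, together with a spot check of symmetry and the triangle inequality.

```python
def d_first(p, q):
    """The semimetric |x_1 - x_2| on R^2: only first coordinates are compared."""
    return abs(p[0] - q[0])

A, B, C = (1.0, 0.0), (1.0, 5.0), (4.0, -2.0)

# Distinct points with the same first coordinate are at distance 0.
print(A != B, d_first(A, B))

# Symmetry and the triangle inequality still hold for these points.
print(d_first(A, C) == d_first(C, A), d_first(A, C) <= d_first(A, B) + d_first(B, C))
```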

Somewhat more interestingly, if f, g are any functions in L[a, b], we have defined (in
Definition 10.2.9) the distance function:

    dist(f, g) := ∫_a^b |f − g|.

Here it is clear that any two functions that are equal except at a countable set of points will
have distance equal to 0 from each other (in fact, this is also true when the functions are
equal almost everywhere).

The reader can retrace the discussion in the present section and see that most of what 
we have done remains true for semimetrics and semimetric spaces. The main difference is 
that a sequence in a semimetric space does not necessarily converge to a unique limit. 
While this seems to be rather unusual, it is actually not a very serious problem and one can 
learn to adjust to this situation. The other alternative is to “identify” points that have
distance 0 from each other. This identification procedure is often invoked, but it means one
is dealing with “equivalence classes” rather than individual points. Often this cure is worse 
than the malady. 


Exercises for Section 11.4 


1. Show that the functions d_1 and d_∞ defined in 11.4.2(c) are metrics on R^2.

2. Show that the functions d_∞ and d_1 defined in 11.4.2(d, e) are metrics on C[0, 1].

3. Verify that the discrete metric on a set S as defined in 11.4.2(f) is a metric.

4. If P_n := (x_n, y_n) ∈ R^2 and d_∞ is the metric in 11.4.2(c), show that (P_n) converges to P := (x, y)
with respect to this metric if and only if (x_n) and (y_n) converge to x and y, respectively.

5. Verify the conclusion of Exercise 4 if d_∞ is replaced by d_1.

6. Let S be a nonempty set and let d be the discrete metric defined in 11.4.2(f). Show that in the
metric space (S, d), a sequence (x_n) in S converges to x if and only if there is a K ∈ N such that
x_n = x for all n ≥ K.

7. Show that if d is the discrete metric on a set S, then every subset of S is both open and closed
in (S, d).

8. Let P := (x, y) and O := (0, 0) in R^2. Draw the following sets in the plane:
(a) {P ∈ R^2 : d_1(O, P) ≤ 1},
(b) {P ∈ R^2 : d_∞(O, P) ≤ 1}.

9. Prove that in any metric space, an ε-neighborhood of a point is an open set.

10. Prove Theorem 11.4.11.

11. Prove Theorem 11.4.12.

12. If (S, d) is a metric space, a subset A ⊆ S is said to be bounded if there exists x_0 ∈ S and a
number B > 0 such that A ⊆ {x ∈ S : d(x, x_0) ≤ B}. Show that if A is a compact subset of S,
then A is closed and bounded.


APPENDIX A 


LOGIC AND PROOFS 


Natural science is concerned with collecting facts and organizing these facts into a 
coherent body of knowledge so that one can understand nature. Originally much of 
science was concerned with observation, the collection of information, and its classification.
This classification gradually led to the formation of various “theories” that helped the
investigators to remember the individual facts and to be able to explain and sometimes
predict natural phenomena. The ultimate aim of most scientists is to be able to organize
their science into a coherent collection of general principles and theories so that these
principles will enable them both to understand nature and to make predictions of the outcome
of future experiments. Thus they want to be able to develop a system of general principles 
(or axioms) for their science that will enable them to deduce the individual facts and 
consequences from these general laws. 

Mathematics is different from the other sciences; by its very nature, it is a deductive 
science. That is not to say that mathematicians do not collect facts and make observations 
concerning their investigations. In fact, many mathematicians spend a large amount of time 
performing calculations of special instances of the phenomena they are studying in the 
hopes that they will discover “unifying principles.” (The great Gauss did a vast amount of 
calculation and studied much numerical data before he was able to formulate a conjecture 
concerning the distribution of prime numbers.) However, even after these principles and 
conjectures are formulated, the work is far from over, for mathematicians are not satisfied 
until conjectures have been derived (i.e., proved) from the axioms of mathematics, from the 
definitions of the terms, and from results (or theorems) that have previously been proved. 
Thus, a mathematical statement is not a theorem until it has been carefully derived from 
axioms, definitions, and previously proved theorems. 

A few words about the axioms (i.e., postulates, assumptions, etc.) of mathematics are 
in order. There are a few axioms that apply to all of mathematics—the ‘‘axioms of set 
theory’’—and there are specific axioms within different areas of mathematics. Sometimes 
these axioms are stated formally, and sometimes they are built into definitions. For 
example, we list properties in Chapter 2 that we assume the real number system possesses; 
they are really a set of axioms. As another example, the definition of a “group” in abstract 
algebra is basically a set of axioms that we assume a set of elements to possess, and the 
study of group theory is an investigation of the consequences of these axioms. 

Students studying real analysis for the first time usually do not have much experience 
in understanding (not to mention constructing) proofs. In fact, one of the main purposes of 
this course (and this book) is to help the reader gain experience in the type of critical 
thought that is used in this deductive process. The purpose of this appendix is to help the 
reader gain insight about the techniques of proof. 


Statements and Their Combinations 


All mathematical proofs and arguments are based on statements, which are declarative 
sentences or meaningful strings of symbols that can be classified as being true or false. It is 
not necessary that we know whether a given statement is actually true or false, but it must
be one or the other, and it cannot be both. (This is the Principle of the Excluded Middle.) 
For example, the sentence “Chickens are pretty” is a matter of opinion and not a statement 
in the sense of logic. Consider the following sentences: 


• It rained in Kuala Lumpur on June 2, 1988.
• Thomas Jefferson was shorter than John Adams.
• There are infinitely many twin primes.
• This sentence is false.


The first three are statements: the first is true, the second is false, and the third is either true 
or false, but we are not sure which at this time. The fourth sentence is not a statement; it can 
be neither true nor false since it leads to contradictory conclusions. 

Some statements (such as “1 + 1 = 2”) are always true; they are called tautologies. 
Some statements (such as “2 = 3”) are always false; they are called contradictions or 
falsities. Some statements (such as “x^2 = 1”) are sometimes true and sometimes false
(e.g., true when x = 1 and false when x = 3). Of course, for the statement to be completely
clear, it is necessary that the proper context has been established and the meaning of the 
symbols has been properly defined (e.g., we need to know that we are referring to integer 
arithmetic in the preceding examples). 

Two statements P and Q are said to be logically equivalent if P is true exactly when Q
is true (and hence P is false exactly when Q is false). In this case we often write P ≡ Q. For
example, we write


(x is Abraham Lincoln) ≡ (x is the 16th president of the United States).


There are several different ways of forming new statements from given ones by using 
logical connectives. 
If P is a statement, then its negation is the statement denoted by 


not P, 


which is true when P is false, and is false when P is true. (A common notation for the
negation of P is ¬P.) A little thought shows that


P ≡ not(not P).


This is the Principle of Double Negation. 
If P and Q are statements, then their conjunction is the statement denoted by 


P and Q, 


which is true when both P and Q are true, and is false otherwise. (A standard notation for
the conjunction of P and Q is P ∧ Q.) It is evident that


(P and Q) ≡ (Q and P).

Similarly, the disjunction of P and Q is the statement denoted by


P or Q,


which is true when at least one of P and Q is true, and false only when they are both false. In
legal documents “or” is often denoted by “and/or” to make it clear that this disjunction is
also true when both P and Q are true. (A standard notation for the disjunction of P and Q is
P ∨ Q.) It is also evident that


(P or Q) ≡ (Q or P).

To contrast disjunctive and conjunctive statements, note that the statement “2 < √2 and
√2 < 3” is false, but the statement “2 < √2 or √2 < 3” is true (since √2 is approximately
equal to 1.4142...).

Some thought shows that negation, conjunction, and disjunction are related by 
De Morgan’s Laws: 


not (P and Q) ≡ (not P) or (not Q),
not (P or Q) ≡ (not P) and (not Q).


The first of these equivalencies can be illustrated by considering the statements


P : x = 2,    Q : y ∈ A.


The statement (P and Q) is true when both (x = 2) and (y ∈ A) are true, and it is false when
at least one of (x = 2) and (y ∈ A) is false; that is, the statement not(P and Q) is true when at
least one of the statements (x ≠ 2) and (y ∉ A) holds.
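
Because P and Q can each take only two truth values, De Morgan's Laws can be confirmed by
listing all four cases. The short Python sketch below does this mechanically; the function
name de_morgan_holds is our own choice for the illustration.

```python
from itertools import product


def de_morgan_holds() -> bool:
    """Check both De Morgan laws on all four truth assignments of (P, Q)."""
    for p, q in product([True, False], repeat=2):
        first = (not (p and q)) == ((not p) or (not q))
        second = (not (p or q)) == ((not p) and (not q))
        if not (first and second):
            return False
    return True


print(de_morgan_holds())
```

A complete truth table of this kind is a legitimate proof of a logical equivalence, since it
exhausts every possible case.
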


Implications 


A very important way of forming a new statement from given ones is the implication (or 
conditional) statement, denoted by 


(P ⇒ Q), (if P then Q), or (P implies Q).


Here P is called the hypothesis, and Q is called the conclusion of the implication. To help 
understand the truth values of the implication, consider the statement 


If I win the lottery today, then I’ll buy Sam a car.


Clearly this statement is false if I win the lottery and don’t buy Sam a car. What if I don’t 
win the lottery today? Under this circumstance, I haven’t made any promise about buying 
anyone a car, and since the condition of winning the lottery did not materialize, my failing 
to buy Sam a car should not be considered as breaking a promise. Thus the implication is 
regarded as true when the hypothesis is not satisfied. 

In mathematical arguments, we are very much interested in implications when the 
hypothesis is true, but not much interested in them when the hypothesis is false. The 
accepted procedure is to take the statement P ⇒ Q to be false only when P is true and Q is
false; in all other cases the statement P ⇒ Q is true. (Consequently, if P is false, then we
agree to take the statement P ⇒ Q to be true whether or not Q is true or false. That may
seem strange to the reader, but it turns out to be convenient in practice and consistent with
the other rules of logic.)

We observe that the definition of P ⇒ Q is logically equivalent to


not (P and (not Q)), 


because this statement is false only when P is true and Q is false, and it is true in all other
cases. It also follows from the first De Morgan Law and the Principle of Double Negation
that P ⇒ Q is logically equivalent to the statement


(not P) or Q, 


since this statement is true unless both (not P) and Q are false; that is, unless P is true and Q 
is false. 
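
The agreement between the definition of the implication and its two equivalent forms can be
checked case by case. In the Python sketch below, the helper names implies and
implication_table are our own; the code simply tabulates the four truth assignments.

```python
from itertools import product


def implies(p: bool, q: bool) -> bool:
    """Truth value of 'P implies Q': false only when P is true and Q is false."""
    return not (p and not q)


def implication_table():
    """Rows (P, Q, not(P and not Q), (not P) or Q) for all truth assignments."""
    rows = []
    for p, q in product([True, False], repeat=2):
        rows.append((p, q, implies(p, q), (not p) or q))
    return rows


for row in implication_table():
    print(row)
```

In every row the last two entries agree, and the only row in which they are false is the one
with P true and Q false.
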


Contrapositive and Converse


As an exercise, the reader should show that the implication P ⇒ Q is logically equivalent to
the implication


(not Q) ⇒ (not P),


which is called the contrapositive of the implication P ⇒ Q. For example, if P ⇒ Q is the
implication


If I am in Chicago, then I am in Illinois,


then the contrapositive (not Q) ⇒ (not P) is the implication


If I am not in Illinois, then I am not in Chicago.


The equivalence of these two statements is apparent after a bit of thought. In attempting to 
establish an implication, it is sometimes easier to establish the contrapositive, which is 
logically equivalent to it. (This will be discussed in more detail later.) 

If an implication P ⇒ Q is given, then one can also form the statement


Q ⇒ P,


which is called the converse of P ⇒ Q. The reader must guard against confusing the
converse of an implication with its contrapositive, since they are quite different statements. 
While the contrapositive is logically equivalent to the given implication, the converse is 
not. For example, the converse of the statement 


If I am in Chicago, then I am in Illinois, 
is the statement 
If I am in Illinois, then I am in Chicago. 


Since it is possible to be in Illinois but not in Chicago, these two statements are evidently 
not logically equivalent. 
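
The difference between the contrapositive and the converse also shows up in a truth table.
The following Python sketch (with helper names of our own choosing) reports whether the
contrapositive always agrees with the implication, and lists the truth assignments at which
the converse disagrees with it.

```python
from itertools import product


def implies(p: bool, q: bool) -> bool:
    """Truth value of 'P implies Q', written as (not P) or Q."""
    return (not p) or q


def compare_forms():
    """Return (contrapositive agrees everywhere?, assignments where the converse differs)."""
    contrapositive_same = True
    converse_differs_at = []
    for p, q in product([True, False], repeat=2):
        if implies(p, q) != implies(not q, not p):
            contrapositive_same = False
        if implies(p, q) != implies(q, p):
            converse_differs_at.append((p, q))
    return contrapositive_same, converse_differs_at


print(compare_forms())
```

The converse disagrees exactly when P and Q have different truth values, which corresponds to
being in Illinois but not in Chicago in the example above.
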

There is one final way of forming statements that we will mention. It is the double 
implication (or the biconditional) statement, which is denoted by 


P ⇔ Q    or    P if and only if Q,


and which is defined by


(P ⇒ Q) and (Q ⇒ P).


It is a straightforward exercise to show that P ⇔ Q is true precisely when P and Q are
both true, or both false. 


Context and Quantifiers 


In any form of communication, it is important that the individuals have an appropriate 
context in mind. Statements such as “I saw Mary today” may not be particularly
informative if the hearer knows several persons named Mary. Similarly, if one goes
into the middle of a mathematical lecture and sees the equation x^2 = 1 on the blackboard, it
is useful for the viewer to know what the writer means by the letter x and the symbol 1. Is x 
an integer? A function? A matrix? A subgroup of a given group? Does 1 denote a natural 
number? The identity function? The identity matrix? The trivial subgroup of a group? 
Often the context is well understood by the conversants, but it is always a good idea to 
establish it at the start of a discussion. For example, many mathematical statements involve 
one or more variables whose values usually affect the truth or the falsity of the statement, so
we should always make clear what the possible values of the variables are. 

Very often mathematical statements involve expressions such as “for all,” “for every,”
“for some,” “there exists,” “there are,” and so on. For example, we may have the statements


For any integer x, x^2 = 1


and


There exists an integer x such that x^2 = 1.


Clearly the first statement is false, as is seen by taking x = 3; however, the second statement
is true since we can take either x = 1 or x = −1.

If the context has been established that we are talking about integers, then the above
statements can safely be abbreviated as


For any x, x^2 = 1


and


There exists an x such that x^2 = 1.


The first statement involves the universal quantifier “for every,” and is making a
statement (here false) about all integers. The second statement involves the existential
quantifier “there exists,” and is making a statement (here true) about at least one integer.

These two quantifiers occur so often that mathematicians often use the symbol ∀ to
stand for the universal quantifier, and the symbol ∃ to stand for the existential quantifier.
That is,


∀ denotes “for every,”
∃ denotes “there exists.”


While we do not use these symbols in this book, it is important for the reader to know how
to read formulas in which they appear. For example, the statement


(i) (∀x)(∃y)(x + y = 0)


(understood for integers) can be read


For every integer x, there exists
an integer y such that x + y = 0.


Similarly, the statement


(ii) (∃y)(∀x)(x + y = 0)


can be read


There exists an integer y, such that
for every integer x, then x + y = 0.


These two statements are very different; for example, the first one is true and the second 
one is false. The moral is that the order of the appearance of the two different types of 
quantifiers is very important. It must also be stressed that if several variables appear in a 
mathematical expression with quantifiers, the values of the later variables should be 
assumed to depend on all of the values of the variables that are mentioned earlier. Thus in 
the (true) statement (i) above, the value of y depends on that of x; here if x = 2, then y = —2, 
while if x = 3, then y = —3. 
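
The effect of the order of quantifiers can be observed computationally if the integers are
replaced by a finite window that is symmetric about 0. This replacement is an assumption
made only so that the quantifiers become finite loops; the function names are also our own.
In Python, all plays the role of “for every” and any plays the role of “there exists.”

```python
def forall_exists(domain) -> bool:
    """(for every x)(there exists y)(x + y = 0), with x and y restricted to `domain`."""
    return all(any(x + y == 0 for y in domain) for x in domain)


def exists_forall(domain) -> bool:
    """(there exists y)(for every x)(x + y = 0), with x and y restricted to `domain`."""
    return any(all(x + y == 0 for x in domain) for y in domain)


# Hypothetical finite stand-in for the integers: -5, -4, ..., 5.
window = range(-5, 6)
print(forall_exists(window))   # statement (i) on the window
print(exists_forall(window))   # statement (ii) on the window
```

In the first function the inner search for y is repeated for each x, so y may depend on x;
in the second, a single y must work for every x at once.
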

It is important that the reader understand how to negate a statement that involves
quantifiers. In principle, the method is simple. 


(a) To show that it is false that every element x in some set possesses a certain property P, 
it is enough to produce a single counter-example (that is, a particular element in the 
set that does not possess this property); and 

(b) To show that it is false that there exists an element y in some set that satisfies a certain 
property P, we need to show that every element y in the set fails to have that property. 


Therefore, in the process of forming a negation, 


not (∀x)P becomes (∃x) not P


and similarly,


not (∃y)P becomes (∀y) not P.


When several quantifiers are involved, these changes are repeatedly used. Thus the negation
of the (true) statement (i) given previously becomes in succession


not (∀x)(∃y)(x + y = 0),
(∃x) not (∃y)(x + y = 0),
(∃x)(∀y) not (x + y = 0),
(∃x)(∀y)(x + y ≠ 0).


The last statement can be rendered in words as:


There exists an integer x, such that
for every integer y, then x + y ≠ 0.


(This statement is, of course, false.)

Similarly, the negation of the (false) statement (ii) given previously becomes in
succession


not (∃y)(∀x)(x + y = 0),
(∀y) not (∀x)(x + y = 0),
(∀y)(∃x) not (x + y = 0),
(∀y)(∃x)(x + y ≠ 0).


The last statement is rendered in words as


For every integer y, there exists
an integer x such that x + y ≠ 0.


Note that this statement is true, and that the value (or values) of x that make x + y ≠ 0
depends on y, in general.
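
The rules for negating quantified statements can be tested on a finite domain, where each
statement and its formal negation can both be evaluated directly. The sketch below is an
illustration under that assumption (finite windows of integers standing in for the integers);
all function names are our own.

```python
def statement_i(domain, pred) -> bool:
    """(for every x)(there exists y) pred(x, y)."""
    return all(any(pred(x, y) for y in domain) for x in domain)


def negation_i(domain, pred) -> bool:
    """(there exists x)(for every y) not pred(x, y)."""
    return any(all(not pred(x, y) for y in domain) for x in domain)


def statement_ii(domain, pred) -> bool:
    """(there exists y)(for every x) pred(x, y)."""
    return any(all(pred(x, y) for x in domain) for y in domain)


def negation_ii(domain, pred) -> bool:
    """(for every y)(there exists x) not pred(x, y)."""
    return all(any(not pred(x, y) for x in domain) for y in domain)


def sums_to_zero(x, y):
    return x + y == 0


for domain in (range(-4, 5), range(0, 4)):
    print(statement_i(domain, sums_to_zero), negation_i(domain, sums_to_zero),
          statement_ii(domain, sums_to_zero), negation_ii(domain, sums_to_zero))
```

On each domain, a statement and its negation always receive opposite truth values. Note that
on the window 0, 1, 2, 3 statement (i) becomes false, which illustrates how much the truth of
a quantified statement depends on the context.
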
Similarly, the statement 


For every δ > 0, the interval (−δ, δ)
contains a point belonging to the set A,


can be seen to have the negation


There exists δ > 0 such that the interval
(−δ, δ) does not contain any point in A.

The first statement can be symbolized


(∀δ > 0)(∃y ∈ A)(y ∈ (−δ, δ)),


and its negation can be symbolized by


(∃δ > 0)(∀y ∈ A)(y ∉ (−δ, δ)),


or by


(∃δ > 0)(A ∩ (−δ, δ) = ∅).


It is the strong opinion of the authors that, while the use of this type of symbolism is 
often convenient, it is not a substitute for thought. Indeed, the readers should ordinarily 
reason for themselves what the negation of a statement is and not rely slavishly on 
symbolism. While good notation and symbolism can often be a useful aid to thought, it can 
never be an adequate replacement for thought and understanding. 


Direct Proofs 


Let P and Q be statements. The assertion that the hypothesis P of the implication P ⇒ Q
implies the conclusion Q (or that P ⇒ Q is a theorem) is the assertion that whenever the
hypothesis P is true, then Q is true.

The construction of a direct proof of P ⇒ Q involves the construction of a string of
statements R_1, R_2, ..., R_n such that


P ⇒ R_1, R_1 ⇒ R_2, ..., R_n ⇒ Q.


(The Law of the Syllogism states that if R_1 ⇒ R_2 and R_2 ⇒ R_3 are true, then R_1 ⇒ R_3 is true.)
This construction is usually not an easy task; it may take insight, intuition, and considerable
effort. Often it also requires experience and luck.

In constructing a direct proof, one often works forward from P and backward from Q.
We are interested in logical consequences of P; that is, statements Q_1, ..., Q_k such that
P ⇒ Q_i. And we might also examine statements P_1, ..., P_r such that P_j ⇒ Q. If we can
work forward from P and backward from Q so the string “connects” somewhere in the
middle, then we have a proof. Often in the process of trying to establish P ⇒ Q one finds
that one must strengthen the hypothesis (i.e., add assumptions to P) or weaken the
conclusion (that is, replace Q by a nonequivalent consequence of Q).

Most students are familiar with “direct” proofs of the type described above, but we
will give one elementary example here. Let us prove the following theorem. 


A.1 Theorem The square of an odd integer is also an odd integer. 


If we let n stand for an integer, then the hypothesis is:

P : n is an odd integer.

The conclusion of the theorem is:

Q : n^2 is an odd integer.

We need the definition of odd integer, so we introduce the statement


R_1 : n = 2k − 1 for some integer k.

Then we have P ⇒ R_1. We want to deduce the statement n^2 = 2m − 1 for some integer m,
since this would imply Q. We can obtain this statement by using algebra:


R_2 : n^2 = (2k − 1)^2 = 4k^2 − 4k + 1,
R_3 : n^2 = (4k^2 − 4k + 2) − 1,
R_4 : n^2 = 2(2k^2 − 2k + 1) − 1.


If we let m = 2k^2 − 2k + 1, then m is an integer (why?), and we have deduced the
statement


R_5 : n^2 = 2m − 1.


Thus we have P ⇒ R_1 ⇒ R_2 ⇒ R_3 ⇒ R_4 ⇒ R_5 ⇒ Q, and the theorem is proved.

Of course, this is a clumsy way to present a proof. Normally, the formal logic is 
suppressed and the argument is given in a more conversational style with complete English 
sentences. We can rewrite the preceding proof as follows. 


Proof of A.1 Theorem. If n is an odd integer, then n = 2k − 1 for some integer k.
Then the square of n is given by n^2 = 4k^2 − 4k + 1 = 2(2k^2 − 2k + 1) − 1. If we let
m = 2k^2 − 2k + 1, then m is an integer (why?) and n^2 = 2m − 1. Therefore, n^2 is an odd
integer. Q.E.D.
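
The algebra in this proof produces an explicit witness m for each k, and the identity
n^2 = 2m − 1 can be spot-checked numerically. The Python sketch below is such a check (the
function name odd_square_witness is our own); it tests finitely many cases and so only
illustrates the proof.

```python
def odd_square_witness(k: int) -> int:
    """For the odd integer n = 2k - 1, return the m from the proof with n**2 == 2*m - 1."""
    n = 2 * k - 1
    m = 2 * k * k - 2 * k + 1
    assert n * n == 2 * m - 1   # the identity R_5 from the proof
    return m


for k in range(-3, 4):
    n = 2 * k - 1
    print(n, n * n, odd_square_witness(k))
```
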


At this stage, we see that we may want to make a preliminary argument to prove that 
2k^2 − 2k + 1 is an integer whenever k is an integer. In this case, we could state and prove
this fact as a Lemma, which is ordinarily a preliminary result that is needed to prove a 
theorem, but has little interest by itself. 

Incidentally, the letters Q.E.D. stand for quod erat demonstrandum, which is Latin for 
“which was to be demonstrated.” 


Indirect Proofs 


There are basically two types of indirect proofs: (i) contrapositive proofs, and (ii) proofs by 
contradiction. Both types start with the assumption that the conclusion Q is false, in other 
words, that the statement “not Q” is true. 


(i) Contrapositive proofs. Instead of proving P ⇒ Q, we may prove its logically
equivalent contrapositive: not Q ⇒ not P.


Consider the following theorem.


A.2 Theorem If n is an integer and n^2 is even, then n is even.


The negation of “Q : n is even” is the statement “not Q : n is odd.” The hypothesis
“P : n^2 is even” has a similar negation, so that the contrapositive is the implication: If n is
odd, then n^2 is odd. But this is exactly Theorem A.1, which was proved above. Therefore
this provides a proof of Theorem A.2. 

The contrapositive proof is often convenient when the universal quantifier is involved, 
for the contrapositive form will then involve the existential quantifier. The following 
theorem is an example of this situation. 


A.3 Theorem Let a ≥ 0 be a real number. If, for every ε > 0, we have 0 ≤ a < ε, then a = 0.


Proof. If a = 0 is false, then since a ≥ 0, we must have a > 0. In this case, if we choose
ε_0 = ½a, then we have ε_0 > 0 and ε_0 < a, so that the hypothesis 0 ≤ a < ε for all ε > 0 is
false. Q.E.D.


Here is one more example of a contrapositive proof. 


A.4 Theorem If m, n are natural numbers such that m + n > 20, then either m > 10 or
n > 10.


Proof. If the conclusion is false, then we have both m ≤ 10 and n ≤ 10. (Recall
De Morgan’s Law.) Then addition gives us m + n ≤ 10 + 10 = 20, so that the hypothesis
is false. Q.E.D.


(ii) Proof by contradiction. This method of proof employs the fact that if C is a contradiction
(i.e., a statement that is always false, such as “1 = 0”), then the two statements


(P and (not Q)) ⇒ C,    P ⇒ Q


are logically equivalent. Thus we establish P ⇒ Q by showing that the statement (P and
(not Q)) implies a contradiction.


A.5 Theorem Let a > 0 be a real number. If a > 0, then 1/a > 0. 


Proof. We suppose that the statement a > 0 is true and that the statement 1/a > 0 is false.
Therefore, 1/a ≤ 0. But since a > 0 is true, it follows from the order properties of R that
a(1/a) ≤ 0. Since 1 = a(1/a), we deduce that 1 ≤ 0. However, this conclusion contradicts
the known result that 1 > 0. Q.E.D.


There are several classic proofs by contradiction (also known as reductio ad 
absurdum) in the mathematical literature. One is the proof that there is no rational number 
r that satisfies r^2 = 2. (This is Theorem 2.1.4 in the text.) Another is the proof of the
infinitude of primes, found in Euclid’s Elements. Recall that a natural number p is prime if 
its only integer divisors are 1 and p itself. We will assume the basic results that each prime 
number is greater than 1 and each natural number greater than 1 is either prime or divisible 
by a prime. 


A.6 Theorem (Euclid’s Elements, Book IX, Proposition 20.) There are infinitely many 
prime numbers. 


Proof. If we suppose by way of contradiction that there are finitely many prime numbers,
then we may assume that S = {p_1, ..., p_n} is the set of all prime numbers. We let
m = p_1 ··· p_n, the product of all the primes, and we let q = m + 1. Since q > p_i for
all i, we see that q is not in S, and therefore q is not prime. Then there exists a prime p that is
a divisor of q. Since p is prime, then p = p_j for some j, so that p is a divisor of m. But if p
divides both m and q = m + 1, then p divides the difference q − m = 1. However, this is
impossible, so we have obtained a contradiction. Q.E.D.
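
The construction in this proof can be carried out concretely: starting from any finite list of
primes, the number q = m + 1 has a prime divisor that is not in the list. The Python sketch
below illustrates this constructive core of the argument; the function names are our own, and
trial division is used only because it is simple.

```python
from math import prod


def smallest_prime_factor(q: int) -> int:
    """Smallest divisor d > 1 of q (which is necessarily prime), for q >= 2."""
    d = 2
    while d * d <= q:
        if q % d == 0:
            return d
        d += 1
    return q


def euclid_new_prime(primes):
    """Given a finite list of primes, return a prime that is not in the list."""
    q = prod(primes) + 1            # q = m + 1 as in the proof
    return smallest_prime_factor(q)  # a prime divisor of q; it cannot divide m


print(euclid_new_prime([2, 3, 5]))
print(euclid_new_prime([2, 3, 5, 7, 11, 13]))
```

The second call shows that q itself need not be prime: 2·3·5·7·11·13 + 1 = 30031 = 59·509.
The proof only requires that some prime divides q.
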


APPENDIX B


FINITE AND COUNTABLE SETS


We will establish the results that were stated in Section 1.3 without proof. The reader 
should refer to that section for the definitions. 

The first result is sometimes called the ‘‘Pigeonhole Principle.” It may be interpreted 
as saying that if m pigeons are put into n pigeonholes and if m > n, then at least two 
pigeons must share one of the pigeonholes. This is a frequently used result in combinatorial 
analysis. It yields many useful consequences. 


B.1 Theorem Let m, n ∈ N with m > n. Then there does not exist an injection from N_m
into N_n.


Proof. We will prove this by induction on n.

If n = 1 and if g is any map of N_m (m > 1) into N_1, then it is clear that g(1) = ··· =
g(m) = 1, so that g is not injective.

Assume that k ≥ 1 is such that if m > k, there is no injection from N_m into N_k. We
will show that if m > k + 1, there is no function h : N_m → N_{k+1} that is an injection.


Case 1: If the range h(N_m) ⊆ N_k ⊂ N_{k+1}, then the induction hypothesis implies that h is
not an injection of N_m into N_k, and therefore into N_{k+1}.


Case 2: Suppose that h(N_m) is not contained in N_k. If more than one element in N_m
is mapped into k + 1, then h is not an injection. Therefore, we may assume that a single
p ∈ N_m is mapped into k + 1 by h. We now define h_1 : N_{m−1} → N_k by


h_1(q) := h(q) if q = 1, ..., p − 1,
h_1(q) := h(q + 1) if q = p, ..., m − 1.


Since the induction hypothesis implies that h_1 is not an injection into N_k, it is easily seen
that h is not an injection into N_{k+1}. Q.E.D.
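
For small values of m and n the Pigeonhole Principle can be confirmed by brute force, since
there are only n^m maps from N_m into N_n. The Python sketch below counts the injective ones;
the function name count_injections is our own, and the exhaustive search is feasible only for
small m and n.

```python
from itertools import product


def count_injections(m: int, n: int) -> int:
    """Count the injective maps from {1, ..., m} into {1, ..., n} by listing all n**m maps."""
    count = 0
    for images in product(range(1, n + 1), repeat=m):
        # images[i - 1] is the value of the map at i; injective means no repeated value.
        if len(set(images)) == m:
            count += 1
    return count


print(count_injections(3, 2))   # m > n: no injections
print(count_injections(2, 3))   # m <= n: injections exist
```
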


We now show that a finite set determines a unique number in N. 


1.3.2 Uniqueness Theorem If S is a finite set, then the number of elements in S is a
unique number in N.


Proof. If the set S has m elements, there exists a bijection f_1 of N_m onto S. If S also has n
elements, there exists a bijection f_2 of N_n onto S. If m > n, then (by Exercise 21
of Section 1.1) f_2^{-1} ∘ f_1 is a bijection of N_m onto N_n, which contradicts Theorem B.1.
If n > m, then f_1^{-1} ∘ f_2 is a bijection of N_n onto N_m, which contradicts Theorem B.1.
Therefore we have m = n. Q.E.D.


B.2 Theorem If n ∈ N, there does not exist an injection from N into N_n.


Proof. Assume that f : N → N_n is an injection, and let m := n + 1. Then the restriction of
f to N_m ⊂ N is also an injection into N_n. But this contradicts Theorem B.1. Q.E.D.


1.3.3 Theorem The set N of natural numbers is an infinite set.


Proof. If N is a finite set, there exists some n ∈ N and a bijection f of N_n onto N. In this
case the inverse function f^{-1} is a bijection (and hence an injection) of N onto N_n. But this
contradicts Theorem B.2. Q.E.D.


We will next establish Theorem 1.3.8. In connection with the array displayed in
Figure 1.3.1, the function h was defined by h(m, n) = ψ(m + n − 2) + m, where
ψ(k) = 1 + 2 + ··· + k = ½k(k + 1). We now prove that the function h is a bijection.


1.3.8 Theorem The set N x N is denumerable. 


Proof. We will show that the function h is a bijection.

(a) We first show that h is injective. If (m, n) ≠ (m′, n′), then either (i) m + n ≠
m′ + n′, or (ii) m + n = m′ + n′ and m ≠ m′.

In case (i), we may suppose m + n < m′ + n′. Then, using formula (1), the fact that ψ
is increasing, and m′ > 0, we have


h(m, n) = ψ(m + n − 2) + m ≤ ψ(m + n − 2) + (m + n − 1)
        = ψ(m + n − 1) ≤ ψ(m′ + n′ − 2)
        < ψ(m′ + n′ − 2) + m′ = h(m′, n′).


In case (ii), if m + n = m′ + n′ and m ≠ m′, then


h(m, n) − m = ψ(m + n − 2) = ψ(m′ + n′ − 2) = h(m′, n′) − m′,


whence h(m, n) ≠ h(m′, n′).

(b) Next we show that h is surjective.

Clearly h(1, 1) = 1. If p ∈ N with p ≥ 2, we will find a pair (m_p, n_p) ∈ N × N with
h(m_p, n_p) = p. Since p < ψ(p), then the set E_p := {k ∈ N : p ≤ ψ(k)} is nonempty.

Using the Well-Ordering Property 1.2.1, we let k_p > 1 be the least element in E_p. (This
means that p lies in the k_p-th diagonal.) Since p ≥ 2, it follows from equation (1) that


ψ(k_p − 1) < p ≤ ψ(k_p) = ψ(k_p − 1) + k_p.


Let m_p := p − ψ(k_p − 1) so that 1 ≤ m_p ≤ k_p, and let n_p := k_p − m_p + 1 so that 1 ≤
n_p ≤ k_p and m_p + n_p − 1 = k_p. Therefore,


h(m_p, n_p) = ψ(m_p + n_p − 2) + m_p = ψ(k_p − 1) + m_p = p.


Thus h is a bijection and N × N is denumerable. Q.E.D.
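
The surjectivity argument is constructive: it tells us how to recover the pair (m_p, n_p) from
p. The Python sketch below implements h and this inverse procedure (the function names psi,
h, and h_inverse are our own), and checks on an initial range that they undo one another.

```python
def psi(k: int) -> int:
    """psi(k) = 1 + 2 + ... + k = k(k + 1)/2, with psi(0) = 0."""
    return k * (k + 1) // 2


def h(m: int, n: int) -> int:
    """The counting function h(m, n) = psi(m + n - 2) + m on N x N."""
    return psi(m + n - 2) + m


def h_inverse(p: int):
    """Follow part (b) of the proof: find the least k with p <= psi(k), then m_p and n_p."""
    k = 1
    while psi(k) < p:
        k += 1
    m = p - psi(k - 1)
    n = k - m + 1
    return m, n


for p in range(1, 11):
    print(p, h_inverse(p))
```

Listing the pairs in this order walks through the diagonals m + n = 2, 3, 4, ... of the array
one after another.
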


The next result is crucial in proving Theorems 1.3.9 and 1.3.10. 


B.3 Theorem If A ⊆ N and A is infinite, there exists a function φ : N → A such that
φ(n + 1) > φ(n) ≥ n for all n ∈ N. Moreover, φ is a bijection of N onto A.


Proof. Since A is infinite, it is not empty. We will use the Well-Ordering Property 1.2.1 of
N to give a recursive definition of φ.

Since A ≠ ∅, there is a least element of A, which we define to be φ(1); therefore,
φ(1) ≥ 1.

Since A is infinite, the set A_1 := A\{φ(1)} is not empty, and we define φ(2) to be the least
element of A_1. Therefore φ(2) > φ(1) ≥ 1, so that φ(2) ≥ 2.

Suppose that φ has been defined to satisfy φ(n + 1) > φ(n) ≥ n for n = 1, ..., k − 1,
whence φ(k) > φ(k − 1) ≥ k − 1 so that φ(k) ≥ k. Since the set A is infinite, the set


A_k := A\{φ(1), ..., φ(k)}


is not empty and we define φ(k + 1) to be the least element in A_k. Therefore φ(k + 1) >
φ(k), and since φ(k) ≥ k, we also have φ(k + 1) ≥ k + 1. Therefore, φ is defined on all
of N.

We claim that φ is an injection. If m > n, then m = n + r for some r ∈ N. If
r = 1, then φ(m) = φ(n + 1) > φ(n). Suppose that φ(n + k) > φ(n); we will show that
φ(n + (k + 1)) > φ(n). Indeed, this follows from the fact that φ(n + (k + 1)) = φ((n +
k) + 1) > φ(n + k) > φ(n). Since φ(m) > φ(n) whenever m > n, it follows that φ is an
injection.

We claim that φ is a surjection of N onto A. If not, the set Ã := A\φ(N) is not empty,
and we let p be the least element in Ã. We claim that p belongs to the set {φ(1), ..., φ(p)}.
Indeed, if this is not true, then


p ∈ A\{φ(1), ..., φ(p)} = A_p,


so that φ(p + 1), being the least element in A_p, must satisfy φ(p + 1) ≤ p. But this
contradicts the fact that φ(p + 1) > φ(p) ≥ p. Therefore Ã is empty and φ is a surjection
onto A. Q.E.D.
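
The recursive definition of φ amounts to listing the elements of A in increasing order. The
Python sketch below does this for a set A described by a membership test; the function names
and the choice of the perfect squares as A are our own assumptions for the illustration. The
search for the next element terminates at every step only because A is infinite, which mirrors
the hypothesis of the theorem.

```python
from itertools import count, islice
from math import isqrt


def enumerate_increasing(is_member):
    """Yield phi(1), phi(2), ...: the elements of A = {n in N : is_member(n)} in increasing order."""
    for n in count(1):
        if is_member(n):
            yield n


def first_values(is_member, k: int):
    """Return [phi(1), ..., phi(k)]."""
    return list(islice(enumerate_increasing(is_member), k))


def is_square(n: int) -> bool:
    """Hypothetical example set A: the perfect squares."""
    return isqrt(n) ** 2 == n


values = first_values(is_square, 8)
print(values)
print(all(v >= i for i, v in enumerate(values, start=1)))   # phi(n) >= n
```
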


B.4 Theorem If A ⊆ N, then A is countable.


Proof. If A is finite, then it is countable, so it suffices to consider the case that A is infinite.
In this case, Theorem B.3 implies that there exists a bijection φ of N onto A, so that A is
denumerable and, therefore, countable. Q.E.D.


1.3.9 Theorem Suppose that S and T are sets and that T ⊆ S.


(a) If S is a countable set, then T is a countable set.
(b) If T is an uncountable set, then S is an uncountable set.


Proof. (a) If S is a finite set, it follows from Theorem 1.3.5(a) that T is finite, and
therefore countable. If S is denumerable, then there exists a bijection ψ of S onto N. Since
ψ(S) ⊆ N, Theorem B.4 implies that ψ(S) is countable. Since the restriction of ψ to T is a
bijection onto ψ(T) and ψ(T) ⊆ N is countable, it follows that T is also countable.


(b) This assertion is the contrapositive of the assertion in (a). Q.E.D. 


APPENDIX C 


THE RIEMANN AND 
LEBESGUE CRITERIA 


We will give here proofs of the Riemann and Lebesgue Criteria for a function to be 
Riemann integrable. First we will give the Riemann Criterion, which is interesting in itself, 
and also leads to the more incisive Lebesgue Criterion. 


C.1 Riemann Integrability Criterion Let f : [a, b] → R be bounded. Then the following
assertions are equivalent:


(a) f ∈ R[a, b].
(b) For every ε > 0 there exists a partition P_ε such that if P_1, P_2 are any tagged
partitions having the same subintervals as P_ε, then


(1) |S(f; P_1) − S(f; P_2)| < ε.


(c) For every ε > 0 there exists a partition P_ε = {I_i}_{i=1}^n = {[x_{i−1}, x_i]}_{i=1}^n such that if
m_i := inf{f(x) : x ∈ I_i} and M_i := sup{f(x) : x ∈ I_i} then


(2) Σ_{i=1}^n (M_i − m_i)(x_i − x_{i−1}) < 2ε.


Proof. (a) ⇒ (b) Given ε > 0, let η_ε > 0 be as in the Cauchy Criterion 7.2.1, and let P_ε
be any partition with ||P_ε|| < η_ε. Then if P_1, P_2 are any tagged partitions with the same
subintervals as P_ε, then ||P_1|| < η_ε and ||P_2|| < η_ε and so (1) holds.

(b) ⇒ (c) Given ε > 0, let P_ε = {I_i}_{i=1}^n be a partition as in (b) and let m_i and M_i be
as in the statement of (c). Since m_i is an infimum and M_i is a supremum, there exist points u_i
and v_i in I_i with


f(u_i) < m_i + ε/(2(b − a)) and M_i − ε/(2(b − a)) < f(v_i),


so that we have


M_i − m_i < f(v_i) − f(u_i) + ε/(b − a) for i = 1, ..., n.


If we multiply these inequalities by (x_i − x_{i−1}) and sum, we obtain


Σ_{i=1}^n (M_i − m_i)(x_i − x_{i−1}) < Σ_{i=1}^n (f(v_i) − f(u_i))(x_i − x_{i−1}) + ε.


We let Q_1 := {(I_i, u_i)}_{i=1}^n and Q_2 := {(I_i, v_i)}_{i=1}^n, so that these tagged partitions have the
same subintervals as P_ε does. Also, the sum on the right side equals S(f; Q_2) − S(f; Q_1).
Hence it follows from (1) that inequality (2) holds.

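(c) $\Rightarrow$ (a) Define the step functions $\alpha_\varepsilon$ and $\omega_\varepsilon$ on $[a,b]$ by

$$\alpha_\varepsilon(x) := m_i \quad\text{and}\quad \omega_\varepsilon(x) := M_i \quad\text{for } x \in (x_{i-1}, x_i),$$

and $\alpha_\varepsilon(x_i) := f(x_i) =: \omega_\varepsilon(x_i)$ for $i = 0, 1, \ldots, n$; then $\alpha_\varepsilon(x) \le f(x) \le \omega_\varepsilon(x)$ for $x \in [a,b]$.
Since $\alpha_\varepsilon$ and $\omega_\varepsilon$ are step functions, they are Riemann integrable and

$$\int_a^b \alpha_\varepsilon = \sum_{i=1}^{n} m_i (x_i - x_{i-1}) \quad\text{and}\quad \int_a^b \omega_\varepsilon = \sum_{i=1}^{n} M_i (x_i - x_{i-1}).$$

Therefore it follows that

$$\int_a^b (\omega_\varepsilon - \alpha_\varepsilon) = \sum_{i=1}^{n} (M_i - m_i)(x_i - x_{i-1}) < 2\varepsilon.$$

Since $\varepsilon > 0$ is arbitrary, the Squeeze Theorem implies that $f \in \mathcal{R}[a,b]$. Q.E.D.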
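As a numerical illustration of condition (c), the following Python sketch evaluates the sum in (2) on a uniform partition. It assumes that the function is increasing, so that the infimum and supremum on each subinterval are attained at the endpoints; the helper name `oscillation_sum` and the test function $x^2$ on $[0,1]$ are illustrative choices, not taken from the text.

```python
def oscillation_sum(f, a, b, n):
    """Sum of (M_i - m_i)(x_i - x_{i-1}) over a uniform partition with n pieces.

    Assumption: f is increasing on [a, b], so m_i = f(x_{i-1}) and
    M_i = f(x_i) exactly. For a general f the extrema would have to be
    estimated instead.
    """
    h = (b - a) / n
    total = 0.0
    for i in range(1, n + 1):
        x_prev = a + (i - 1) * h
        x_curr = a + i * h
        total += (f(x_curr) - f(x_prev)) * h
    return total


def square(x):
    return x * x


# For an increasing f the sum telescopes to (f(b) - f(a)) * h,
# so it can be made smaller than any prescribed 2 * epsilon.
for n in (10, 100, 1000):
    print(n, oscillation_sum(square, 0.0, 1.0, n))
```

Refining the partition shrinks the sum in proportion to the mesh, which is the behaviour that (c) requires.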


We have already seen that every continuous function on $[a,b]$ is Riemann integrable.
We also saw in Example 7.1.7 that Thomae’s function is Riemann integrable. Since
Thomae’s function has a countable set of points of discontinuity, it is evident that
continuity is not a necessary condition for Riemann integrability. Indeed, it is reasonable
to ask “how discontinuous” a function may be, yet still be Riemann integrable. The
Riemann Criterion throws some light on that question in showing that sums of the form
(2) must be arbitrarily small. Since the terms $(M_i - m_i)(x_i - x_{i-1})$ in this sum are all $\ge 0$, it
follows that each of these terms must be small. Such a term will be small if (i) the difference
$M_i - m_i$ is small (which will be the case if the function is continuous on the interval $[x_{i-1}, x_i]$)
or if (ii) an interval where the difference $M_i - m_i$ is not small has small length.

The Lebesgue Criterion, which we will discuss next, makes these ideas more precise.
But first it is convenient to have the notion of the oscillation of a function.


C.2 Definition Let $f : A \to \mathbb{R}$ be a bounded function. If $S \subseteq A \subseteq \mathbb{R}$, we define the
oscillation of $f$ on $S$ to be

$$W(f; S) := \sup\{|f(x) - f(y)| : x, y \in S\}. \tag{3}$$

It is easily seen that we can also write

$$W(f; S) = \sup\{f(x) - f(y) : x, y \in S\} = \sup\{f(x) : x \in S\} - \inf\{f(x) : x \in S\}.$$

It is also trivial that if $S \subseteq T \subseteq A$, then

$$0 \le W(f; S) \le W(f; T) \le 2 \cdot \sup\{|f(x)| : x \in A\}.$$

If $r > 0$, we recall that the $r$-neighborhood of $c \in A$ is the set

$$V_r(c) := \{x \in A : |x - c| < r\}.$$


C.3 Definition If $c \in A$, we define the oscillation of $f$ at $c$ by

$$w(f; c) := \inf\{W(f; V_r(c)) : r > 0\} = \lim_{r \to 0+} W(f; V_r(c)). \tag{4}$$

Since $r \mapsto W(f; V_r(c))$ is an increasing function for $r > 0$, this right-hand limit exists and
equals the indicated infimum.


C.4 Lemma If $f : A \to \mathbb{R}$ is bounded and $c \in A$, then $f$ is continuous at $c$ if and only if
the oscillation $w(f; c) = 0$.


Proof. ($\Rightarrow$) If $f$ is continuous at $c$, given $\varepsilon > 0$ there exists $\delta > 0$ such that if $x \in V_\delta(c)$,
then $|f(x) - f(c)| < \varepsilon/2$. Therefore, if $x, y \in V_\delta(c)$, we have $|f(x) - f(y)| < \varepsilon$, whence
$0 \le w(f; c) \le W(f; V_\delta(c)) \le \varepsilon$. Since $\varepsilon > 0$ is arbitrary, this implies that $w(f; c) = 0$.
($\Leftarrow$) If $w(f; c) = 0$ and $\varepsilon > 0$, there exists $s > 0$ with $W(f; V_s(c)) < \varepsilon$. Thus, if $|x - c| < s$,
then $|f(x) - f(c)| < \varepsilon$, and $f$ is continuous at $c$. Q.E.D.
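To make the lemma concrete, here is a small Python sketch that estimates $W(f; V_r(c))$ for shrinking radii. It is only a sampled estimate, which can underestimate the true supremum minus infimum, and the names `oscillation_on_neighborhood` and `sgn` are illustrative assumptions rather than notation from the text.

```python
def oscillation_on_neighborhood(f, c, r, samples=2001):
    """Estimate W(f; V_r(c)) = sup f - inf f on (c - r, c + r) by sampling.

    Sampling can only give a lower estimate of the true oscillation.
    """
    # Sample points strictly inside the open interval (c - r, c + r).
    points = [c - r + 2 * r * (j + 1) / (samples + 1) for j in range(samples)]
    values = [f(x) for x in points]
    return max(values) - min(values)


def sgn(x):
    """Signum function: -1, 0, or 1."""
    return (x > 0) - (x < 0)


radii = [10.0 ** (-k) for k in range(1, 6)]
# At the jump c = 0 every neighborhood contains both signs.
at_jump = [oscillation_on_neighborhood(sgn, 0.0, r) for r in radii]
# At c = 1 the function is locally constant.
at_one = [oscillation_on_neighborhood(sgn, 1.0, r) for r in radii]
print(at_jump, at_one)
```

The estimates stay at 2 at the discontinuity and are 0 at the point of continuity, in line with the equivalence $w(f;c) = 0 \iff f$ continuous at $c$.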


We will now give the details of the proof of the Lebesgue Integrability Criterion. First 
we recall the statement of the theorem. 


Lebesgue’s Integrability Criterion A bounded function $f : [a,b] \to \mathbb{R}$ is Riemann
integrable if and only if it is continuous almost everywhere on $[a,b]$.


Proof. ($\Rightarrow$) Let $\varepsilon > 0$ be given and, for each $k \in \mathbb{N}$, let $H_k := \{x \in [a,b] : w(f; x) > 1/2^k\}$.
We will show that $H_k$ is contained in the union of a finite number of intervals having total
length $< \varepsilon/2^k$.

By the Riemann Criterion, there is a partition $\mathcal{P}_k = \{[x^k_{i-1}, x^k_i]\}_{i=1}^{n(k)}$ such that if $m^k_i$
(respectively, $M^k_i$) is the infimum (respectively, supremum) of $f$ on the interval $[x^k_{i-1}, x^k_i]$, then

$$\sum_{i=1}^{n(k)} (M^k_i - m^k_i)(x^k_i - x^k_{i-1}) < \varepsilon/4^k.$$

If $x \in H_k \cap (x^k_{i-1}, x^k_i)$, there exists $r > 0$ such that $V_r(x) \subseteq (x^k_{i-1}, x^k_i)$, whence

$$1/2^k < w(f; x) \le W(f; V_r(x)) \le M^k_i - m^k_i.$$

If we denote a summation over those $i$ with $H_k \cap (x^k_{i-1}, x^k_i) \ne \emptyset$ by $\sum'$, then

$$(1/2^k) \sum{}' (x^k_i - x^k_{i-1}) \le \sum{}' (M^k_i - m^k_i)(x^k_i - x^k_{i-1}) < \varepsilon/4^k,$$

whence it follows that

$$\sum{}' (x^k_i - x^k_{i-1}) < \varepsilon/2^k.$$

Since $H_k$ differs from the union of sets $H_k \cap (x^k_{i-1}, x^k_i)$ by at most a finite number of the
partition points, we conclude that $H_k$ is contained in the union of a finite number of intervals
with total length $< \varepsilon/2^k$.

Finally, since $D := \{x \in [a,b] : w(f; x) > 0\} = \bigcup_{k=1}^{\infty} H_k$, it follows that the set $D$ of
points of discontinuity of $f \in \mathcal{R}[a,b]$ is a null set.

($\Leftarrow$) Let $|f(x)| \le M$ for $x \in [a,b]$ and suppose that the set $D$ of points of
discontinuity of $f$ is a null set. Then, given $\varepsilon > 0$ there exists a countable set $\{J_k\}_{k=1}^{\infty}$ of open
intervals with $D \subseteq \bigcup_{k=1}^{\infty} J_k$ and $\sum_{k=1}^{\infty} l(J_k) < \varepsilon/2M$. Following R. A. Gordon, we will
define a gauge on $[a,b]$ that will be useful.

(i) If $t \notin D$, then $f$ is continuous at $t$ and there exists $\delta(t) > 0$ such that if $x \in V_{\delta(t)}(t)$
then $|f(x) - f(t)| < \varepsilon/2$, whence

$$0 \le M_t - m_t := \sup\{f(x) : x \in V_{\delta(t)}(t)\} - \inf\{f(x) : x \in V_{\delta(t)}(t)\} \le \varepsilon.$$

(ii) If $t \in D$, we choose $\delta(t) > 0$ such that $V_{\delta(t)}(t) \subseteq J_k$ for some $k$. For these values of $t$,
we have $0 \le M_t - m_t \le 2M$.

Thus we have defined a gauge $\delta$ on $[a,b]$. If $\dot{\mathcal{P}} = \{([x_{i-1}, x_i], t_i)\}_{i=1}^{n}$ is a $\delta$-fine partition
of $[a,b]$, we divide the indices $i$ into two disjoint sets

$$S_c := \{i : t_i \notin D\} \quad\text{and}\quad S_d := \{i : t_i \in D\}.$$

If $\dot{\mathcal{P}}$ is $\delta$-fine, we have $[x_{i-1}, x_i] \subseteq V_{\delta(t_i)}(t_i)$, whence it follows that $M_i - m_i \le M_{t_i} - m_{t_i}$.
Consequently, if $i \in S_c$ then $M_i - m_i \le \varepsilon$, while if $i \in S_d$ we have $M_i - m_i \le 2M$.

However, the collection of intervals $[x_{i-1}, x_i]$ with $i \in S_d$ are contained in the union of
the intervals $\{J_k\}$ whose total length is $< \varepsilon/2M$. Therefore

$$\begin{aligned}
\sum_{i=1}^{n} (M_i - m_i)(x_i - x_{i-1})
&= \sum_{i \in S_c} (M_i - m_i)(x_i - x_{i-1}) + \sum_{i \in S_d} (M_i - m_i)(x_i - x_{i-1}) \\
&\le \sum_{i \in S_c} \varepsilon (x_i - x_{i-1}) + \sum_{i \in S_d} 2M (x_i - x_{i-1}) \\
&\le \varepsilon (b - a) + 2M \cdot (\varepsilon/2M) \le \varepsilon (b - a + 1).
\end{aligned}$$

Since $\varepsilon > 0$ is arbitrary, we conclude that $f \in \mathcal{R}[a,b]$. Q.E.D.


APPENDIX D 


APPROXIMATE INTEGRATION 


We will supply here the proofs of Theorems 7.5.3, 7.5.6, and 7.5.8. We will not repeat the 
statement of these results, and we will use the notations introduced in Section 7.5 and refer 
to numbered equations there. It will be seen that some important results from Chapters 5 
and 6 are used in these proofs. 


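Proof of Theorem 7.5.3. If $k = 1, 2, \ldots, n$, let $a_k := a + (k-1)h$ and let $\varphi_k : [0,h] \to \mathbb{R}$
be defined by

$$\varphi_k(t) := \tfrac12 t\,[f(a_k) + f(a_k + t)] - \int_{a_k}^{a_k + t} f(x)\,dx$$

for $t \in [0,h]$. Note that $\varphi_k(0) = 0$ and that (by Theorem 7.3.6)

$$\begin{aligned}
\varphi_k'(t) &= \tfrac12 [f(a_k) + f(a_k + t)] + \tfrac12 t f'(a_k + t) - f(a_k + t) \\
&= \tfrac12 [f(a_k) - f(a_k + t)] + \tfrac12 t f'(a_k + t).
\end{aligned}$$

Consequently $\varphi_k'(0) = 0$ and

$$\varphi_k''(t) = -\tfrac12 f'(a_k + t) + \tfrac12 f'(a_k + t) + \tfrac12 t f''(a_k + t) = \tfrac12 t f''(a_k + t).$$

Now let $A$, $B$ be defined by

$$A := \inf\{f''(x) : x \in [a,b]\}, \qquad B := \sup\{f''(x) : x \in [a,b]\},$$

so that we have $\tfrac12 A t \le \varphi_k''(t) \le \tfrac12 B t$ for $t \in [0,h]$, $k = 1, 2, \ldots, n$. Integrating and applying
Theorem 7.3.1, we obtain (since $\varphi_k'(0) = 0$) that $\tfrac14 A t^2 \le \varphi_k'(t) \le \tfrac14 B t^2$ for $t \in [0,h]$,
$k = 1, 2, \ldots, n$. Integrating again and taking $t = h$, we obtain (since $\varphi_k(0) = 0$) that

$$\tfrac{1}{12} A h^3 \le \varphi_k(h) \le \tfrac{1}{12} B h^3$$

for $k = 1, 2, \ldots, n$. If we add these inequalities and note that

$$\sum_{k=1}^{n} \varphi_k(h) = T_n(f) - \int_a^b f(x)\,dx,$$

we conclude that $\tfrac{1}{12} A h^3 n \le T_n(f) - \int_a^b f(x)\,dx \le \tfrac{1}{12} B h^3 n$. Since $h = (b-a)/n$, we
have

$$\tfrac{1}{12} A (b-a) h^2 \le T_n(f) - \int_a^b f(x)\,dx \le \tfrac{1}{12} B (b-a) h^2.$$

Since $f''$ is continuous on $[a,b]$, it follows from the definitions of $A$ and $B$ and Bolzano’s
Intermediate Value Theorem 5.3.7 that there exists a point $c$ in $[a,b]$ such that equation (4)
in Section 7.5 holds. Q.E.D.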
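The two-sided bound obtained in this proof can be checked numerically. The sketch below is an illustration under stated assumptions: it uses $f = \exp$ on $[0,1]$, for which $f'' = \exp$ gives $A = 1$ and $B = e$, and the helper name `trapezoid` is an illustrative choice.

```python
import math


def trapezoid(f, a, b, n):
    """Trapezoidal approximation T_n(f) on n equal subintervals."""
    h = (b - a) / n
    inner = sum(f(a + k * h) for k in range(1, n))
    return h * (0.5 * f(a) + inner + 0.5 * f(b))


a, b, n = 0.0, 1.0, 20
h = (b - a) / n
exact = math.e - 1.0          # integral of exp over [0, 1]
A, B = 1.0, math.e            # inf and sup of f'' = exp on [0, 1]

error = trapezoid(math.exp, a, b, n) - exact
lower = A * (b - a) * h ** 2 / 12
upper = B * (b - a) * h ** 2 / 12
print(lower, error, upper)
```

The computed error $T_n(f) - \int_a^b f$ falls between the two bounds, as the inequalities in the proof predict.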
Proof of Theorem 7.5.6. If $k = 1, 2, \ldots, n$, let $c_k := a + (k - \tfrac12)h$, and let $\psi_k : [0, \tfrac12 h] \to \mathbb{R}$
be defined by

$$\psi_k(t) := \int_{c_k - t}^{c_k + t} f(x)\,dx - f(c_k)\,2t$$

for $t \in [0, \tfrac12 h]$. Note that $\psi_k(0) = 0$ and that since

$$\psi_k(t) = \int_{c_k}^{c_k + t} f(x)\,dx - \int_{c_k}^{c_k - t} f(x)\,dx - f(c_k)\,2t,$$

we have

$$\begin{aligned}
\psi_k'(t) &= f(c_k + t) - f(c_k - t)(-1) - 2f(c_k) \\
&= [f(c_k + t) + f(c_k - t)] - 2f(c_k).
\end{aligned}$$

Consequently $\psi_k'(0) = 0$ and

$$\begin{aligned}
\psi_k''(t) &= f'(c_k + t) + f'(c_k - t)(-1) \\
&= f'(c_k + t) - f'(c_k - t).
\end{aligned}$$

By the Mean Value Theorem 6.2.4, there exists a point $c_{k,t}$ with $|c_k - c_{k,t}| < t$ such that
$\psi_k''(t) = 2t f''(c_{k,t})$. If we let $A$ and $B$ be as in the proof of Theorem 7.5.3, we have
$2tA \le \psi_k''(t) \le 2tB$ for $t \in [0, h/2]$, $k = 1, 2, \ldots, n$. It follows as before that

$$\tfrac13 A t^3 \le \psi_k(t) \le \tfrac13 B t^3$$

for all $t \in [0, \tfrac12 h]$, $k = 1, 2, \ldots, n$. If we put $t = \tfrac12 h$, we get

$$\tfrac{1}{24} A h^3 \le \psi_k(\tfrac12 h) \le \tfrac{1}{24} B h^3.$$

If we add these inequalities and note that

$$\sum_{k=1}^{n} \psi_k(\tfrac12 h) = \int_a^b f(x)\,dx - M_n(f),$$

we conclude that

$$\tfrac{1}{24} A h^3 n \le \int_a^b f(x)\,dx - M_n(f) \le \tfrac{1}{24} B h^3 n.$$

If we use the fact that $h = (b-a)/n$ and apply Bolzano’s Intermediate Value Theorem 5.3.7
to $f''$ on $[a,b]$, we conclude that there exists a point $\gamma \in [a,b]$ such that (7) in Section 7.5
holds. Q.E.D.
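The midpoint bound can be checked in the same spirit. This sketch assumes the test function $f(x) = 1/(1+x)$ on $[0,1]$, for which $f''(x) = 2/(1+x)^3$ gives $A = 1/4$ and $B = 2$ and the exact integral is $\ln 2$; the function names are illustrative.

```python
import math


def midpoint(f, a, b, n):
    """Midpoint approximation M_n(f) on n equal subintervals."""
    h = (b - a) / n
    return h * sum(f(a + (k - 0.5) * h) for k in range(1, n + 1))


def reciprocal(x):
    return 1.0 / (1.0 + x)


a, b, n = 0.0, 1.0, 20
h = (b - a) / n
exact = math.log(2.0)         # integral of 1/(1+x) over [0, 1]
A, B = 0.25, 2.0              # inf and sup of f''(x) = 2/(1+x)**3 on [0, 1]

error = exact - midpoint(reciprocal, a, b, n)
lower = A * (b - a) * h ** 2 / 24
upper = B * (b - a) * h ** 2 / 24
print(lower, error, upper)
```

Note the sign convention: here the quantity bounded is $\int_a^b f - M_n(f)$, the reverse of the trapezoidal case.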


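Proof of Theorem 7.5.8. If $k = 0, 1, 2, \ldots, \tfrac12 n - 1$, let $c_k := a + (2k+1)h$, and let
$\varphi_k : [0,h] \to \mathbb{R}$ be defined by

$$\varphi_k(t) := \tfrac13 t\,[f(c_k - t) + 4f(c_k) + f(c_k + t)] - \int_{c_k - t}^{c_k + t} f(x)\,dx.$$

Evidently $\varphi_k(0) = 0$ and

$$\varphi_k'(t) = \tfrac13 t\,[-f'(c_k - t) + f'(c_k + t)] - \tfrac23 [f(c_k - t) - 2f(c_k) + f(c_k + t)],$$

so that $\varphi_k'(0) = 0$ and

$$\varphi_k''(t) = \tfrac13 t\,[f''(c_k - t) + f''(c_k + t)] - \tfrac13 [-f'(c_k - t) + f'(c_k + t)],$$

so that $\varphi_k''(0) = 0$ and

$$\varphi_k'''(t) = \tfrac13 t\,[f'''(c_k + t) - f'''(c_k - t)].$$

Hence it follows from the Mean Value Theorem 6.2.4 that there is a $\gamma_{k,t}$ with $|c_k - \gamma_{k,t}| \le t$
such that $\varphi_k'''(t) = \tfrac23 t^2 f^{(4)}(\gamma_{k,t})$. If we let $A$ and $B$ be defined by

$$A := \inf\{f^{(4)}(x) : x \in [a,b]\} \quad\text{and}\quad B := \sup\{f^{(4)}(x) : x \in [a,b]\},$$

then we have

$$\tfrac23 A t^2 \le \varphi_k'''(t) \le \tfrac23 B t^2$$

for $t \in [0,h]$, $k = 0, 1, \ldots, \tfrac12 n - 1$. After three integrations, this inequality becomes

$$\tfrac{1}{90} A t^5 \le \varphi_k(t) \le \tfrac{1}{90} B t^5$$

for all $t \in [0,h]$, $k = 0, 1, \ldots, \tfrac12 n - 1$. If we put $t = h$, we get

$$\tfrac{1}{90} A h^5 \le \varphi_k(h) \le \tfrac{1}{90} B h^5$$

for $k = 0, 1, \ldots, \tfrac12 n - 1$. If we add these $\tfrac12 n$ inequalities and note that

$$\sum_{k=0}^{\frac12 n - 1} \varphi_k(h) = S_n(f) - \int_a^b f(x)\,dx,$$

we conclude that

$$\tfrac{1}{90} A h^5 \, \tfrac{n}{2} \le S_n(f) - \int_a^b f(x)\,dx \le \tfrac{1}{90} B h^5 \, \tfrac{n}{2}.$$

Since $h = (b-a)/n$, it follows from Bolzano’s Intermediate Value Theorem 5.3.7
(applied to $f^{(4)}$) that there exists a point $c \in [a,b]$ such that the relation (10) in Section 7.5
holds. Q.E.D.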
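Since $h = (b-a)/n$, the last display is equivalent to $\tfrac{1}{180} A (b-a) h^4 \le S_n(f) - \int_a^b f \le \tfrac{1}{180} B (b-a) h^4$. The following sketch checks this form numerically, assuming $f = \exp$ on $[0,1]$ (so $f^{(4)} = \exp$, $A = 1$, $B = e$); the helper name `simpson` is an illustrative choice.

```python
import math


def simpson(f, a, b, n):
    """Simpson approximation S_n(f); n must be even."""
    if n % 2 != 0:
        raise ValueError("n must be even")
    h = (b - a) / n
    odd = sum(f(a + k * h) for k in range(1, n, 2))
    even = sum(f(a + k * h) for k in range(2, n, 2))
    return h / 3 * (f(a) + 4 * odd + 2 * even + f(b))


a, b, n = 0.0, 1.0, 10
h = (b - a) / n
exact = math.e - 1.0          # integral of exp over [0, 1]
A, B = 1.0, math.e            # inf and sup of the fourth derivative of exp

error = simpson(math.exp, a, b, n) - exact
lower = A * (b - a) * h ** 4 / 180
upper = B * (b - a) * h ** 4 / 180
print(lower, error, upper)
```

Halving $h$ should reduce all three printed numbers by roughly a factor of 16, reflecting the $h^4$ dependence.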


APPENDIX E 


TWO EXAMPLES 


In this appendix we will give an example of a continuous function that has a derivative at no
point and of a continuous curve in $\mathbb{R}^2$ whose range contains the entire unit square of $\mathbb{R}^2$.
Both proofs use the Weierstrass M-Test 9.4.6.


A Continuous Nowhere Differentiable Function 


The example we will give is a modification of one due to B. L. van der Waerden in 1930.
Let $f_0 : \mathbb{R} \to \mathbb{R}$ be defined by $f_0(x) := \operatorname{dist}(x, \mathbb{Z}) = \inf\{|x - k| : k \in \mathbb{Z}\}$, so that $f_0$ is a
continuous “sawtooth” function whose graph consists of lines with slope $\pm 1$ on the
intervals $[k/2, (k+1)/2]$, $k \in \mathbb{Z}$. For each $m \in \mathbb{N}$, let $f_m(x) := (1/4^m) f_0(4^m x)$, so that $f_m$
is also a continuous sawtooth function whose graph consists of lines with slope $\pm 1$ and
with $0 \le f_m(x) \le 1/(2 \cdot 4^m)$. (See Figure E.1.)

Figure E.1 Graphs of $f_0$, $f_1$, and $f_2$.

We now define $g : \mathbb{R} \to \mathbb{R}$ by $g(x) := \sum_{m=0}^{\infty} f_m(x)$. The Weierstrass M-Test implies that
the series is uniformly convergent on $\mathbb{R}$; hence $g$ is continuous on $\mathbb{R}$. We will now show that
$g$ is not differentiable at any point of $\mathbb{R}$.

Fix $x \in \mathbb{R}$. For each $n \in \mathbb{N}$, let $h_n := \pm 1/4^{n+1}$, with the sign chosen so that both $4^n x$
and $4^n (x + h_n)$ lie in the same interval $[k/2, (k+1)/2]$. Since $f_0$ has slope $\pm 1$ on this
interval, then

$$\varepsilon_n := \frac{f_n(x + h_n) - f_n(x)}{h_n} = \frac{f_0(4^n x + 4^n h_n) - f_0(4^n x)}{4^n h_n} = \pm 1.$$

In fact if $m < n$, then the graph of $f_m$ also has slope $\pm 1$ on the interval between $x$ and $x + h_n$,
and so

$$\varepsilon_m := \frac{f_m(x + h_n) - f_m(x)}{h_n} = \pm 1 \quad\text{for } m < n.$$

On the other hand, if $m > n$, then $4^m (x + h_n) - 4^m x = \pm 4^{m-n-1}$ is an integer, and since $f_0$
has period equal to 1, it follows that

$$f_m(x + h_n) - f_m(x) = 0.$$

Consequently, we have

$$\frac{g(x + h_n) - g(x)}{h_n} = \sum_{m=0}^{n} \frac{f_m(x + h_n) - f_m(x)}{h_n} = \sum_{m=0}^{n} \varepsilon_m,$$

whence the difference quotient $(g(x + h_n) - g(x))/h_n$ is an odd integer if $n$ is even, and an
even integer if $n$ is odd. Therefore, the limit

$$\lim_{h \to 0} \frac{g(x + h) - g(x)}{h}$$

does not exist, so $g$ is not differentiable at the arbitrary point $x \in \mathbb{R}$.
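The parity argument can be verified with exact rational arithmetic. The sketch below follows the construction in the text; the function names (`f0`, `f`, `step`, `difference_quotient`) and the sample point $x = 1/3$ are illustrative assumptions. Because the terms with $m > n$ vanish, a finite sum already gives the exact difference quotient of $g$; a few extra terms are included only to confirm that they contribute nothing.

```python
from fractions import Fraction
import math


def f0(t):
    """Distance from t to the nearest integer."""
    u = t - math.floor(t)
    return min(u, 1 - u)


def f(m, x):
    """The scaled sawtooth f_m(x) = f_0(4**m * x) / 4**m."""
    return f0(4 ** m * x) / 4 ** m


def step(n, x):
    """h_n = +-1/4**(n+1), signed so that 4**n * x and 4**n * (x + h_n)
    lie in the same interval [k/2, (k+1)/2]."""
    y2 = 2 * 4 ** n * x
    u = y2 - math.floor(y2)        # position of 2 * 4**n * x inside [k, k+1)
    sign = 1 if u <= Fraction(1, 2) else -1
    return Fraction(sign, 4 ** (n + 1))


def difference_quotient(n, x, extra_terms=5):
    """(g(x + h_n) - g(x)) / h_n; the terms with m > n are all zero."""
    h = step(n, x)
    total = sum(f(m, x + h) - f(m, x) for m in range(n + 1 + extra_terms))
    return total / h


x = Fraction(1, 3)
quotients = [difference_quotient(n, x) for n in range(1, 8)]
print(quotients)
```

Each quotient is an integer whose parity alternates with $n$, so the sequence of quotients cannot converge.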


A Space-Filling Curve 


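We will now give an example of a space-filling curve that was constructed by I. J. Schoenberg
in 1938. Let $\varphi : \mathbb{R} \to \mathbb{R}$ be the continuous, even function with period 2 given by

$$\varphi(t) := \begin{cases} 0 & \text{for } 0 \le t \le 1/3, \\ 3t - 1 & \text{for } 1/3 < t < 2/3, \\ 1 & \text{for } 2/3 \le t \le 1. \end{cases}$$

(See Figure E.2.) For $t \in [0,1]$, we define the functions

$$f(t) := \sum_{k=0}^{\infty} \frac{\varphi(3^{2k} t)}{2^{k+1}} \quad\text{and}\quad g(t) := \sum_{k=0}^{\infty} \frac{\varphi(3^{2k+1} t)}{2^{k+1}}.$$

Since $0 \le \varphi(x) \le 1$ and $\varphi$ is continuous, the Weierstrass M-Test implies that $f$ and $g$ are
continuous on $[0,1]$; moreover, $0 \le f(t) \le 1$ and $0 \le g(t) \le 1$. We will now show that an
arbitrary point $(x_0, y_0)$ in $[0,1] \times [0,1]$ is the image under $(f, g)$ of some point $t_0 \in [0,1]$.
Indeed, let $x_0$ and $y_0$ have the binary (= base 2) expansions:

$$x_0 = \frac{a_0}{2} + \frac{a_2}{2^2} + \frac{a_4}{2^3} + \cdots \quad\text{and}\quad y_0 = \frac{a_1}{2} + \frac{a_3}{2^2} + \frac{a_5}{2^3} + \cdots,$$

where each $a_k$ equals 0 or 1. It will be shown that $x_0 = f(t_0)$ and $y_0 = g(t_0)$, where $t_0$ has the
ternary (= base 3) expansion

$$t_0 = \sum_{k=0}^{\infty} \frac{2a_k}{3^{k+1}} = \frac{2a_0}{3} + \frac{2a_1}{3^2} + \frac{2a_2}{3^3} + \frac{2a_3}{3^4} + \cdots.$$

Figure E.2 Graph of $\varphi$.

First, we note that the above formula does yield a number in $[0,1]$. We also note that if $a_0 = 0$,
then $0 \le t_0 \le 1/3$ so that $\varphi(t_0) = 0$, and if $a_0 = 1$, then $2/3 \le t_0 \le 1$ so that $\varphi(t_0) = 1$;
therefore, in both cases $\varphi(t_0) = a_0$. Similarly, it is seen that for each $n \in \mathbb{N}$ there exists
$m_n \in \mathbb{N}$ such that

$$3^n t_0 = 2m_n + \frac{2a_n}{3} + \frac{2a_{n+1}}{3^2} + \cdots,$$

whence it follows from the fact that $\varphi$ has period 2 that $\varphi(3^n t_0) = a_n$. Finally, we conclude
that

$$f(t_0) = \sum_{k=0}^{\infty} \frac{\varphi(3^{2k} t_0)}{2^{k+1}} = \sum_{k=0}^{\infty} \frac{a_{2k}}{2^{k+1}} = x_0$$

and

$$g(t_0) = \sum_{k=0}^{\infty} \frac{\varphi(3^{2k+1} t_0)}{2^{k+1}} = \sum_{k=0}^{\infty} \frac{a_{2k+1}}{2^{k+1}} = y_0.$$

Therefore $x_0 = f(t_0)$ and $y_0 = g(t_0)$ as claimed.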
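The digit bookkeeping in this argument can be checked exactly for a point with a terminating binary expansion. The sketch below is an illustration under stated assumptions: the digit list `bits` (with all later digits taken to be 0) and the function names are hypothetical choices, and the series are truncated, which is harmless here because the omitted terms are exactly zero.

```python
from fractions import Fraction


def phi(t):
    """Schoenberg's even, 2-periodic function, built from its values on [0, 1]."""
    u = t % 2                      # reduce to [0, 2) using the period
    if u > 1:
        u = 2 - u                  # evenness and periodicity give phi(u) = phi(2 - u)
    if u <= Fraction(1, 3):
        return Fraction(0)
    if u < Fraction(2, 3):
        return 3 * u - 1
    return Fraction(1)


def curve(t, terms):
    """Partial sums of f(t) and g(t) with the given number of terms."""
    fx = sum(phi(3 ** (2 * k) * t) / 2 ** (k + 1) for k in range(terms))
    gy = sum(phi(3 ** (2 * k + 1) * t) / 2 ** (k + 1) for k in range(terms))
    return fx, gy


bits = [1, 0, 1, 1, 0, 1]          # a_0, ..., a_5; every later digit is 0
x0 = sum(Fraction(bits[2 * k], 2 ** (k + 1)) for k in range(3))
y0 = sum(Fraction(bits[2 * k + 1], 2 ** (k + 1)) for k in range(3))
t0 = sum(Fraction(2 * a, 3 ** (k + 1)) for k, a in enumerate(bits))

print(t0, curve(t0, 6), (x0, y0))
```

For these digits the target point is $(3/4, 3/8)$, and the curve evaluated at $t_0$ reproduces it exactly.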


HINTS FOR SELECTED EXERCISES


Reader: Do not look at these hints unless you are stymied. However, after putting a 
considerable amount of thought into a problem, sometimes just a little hint is all that is 
needed. Many of the exercises call for proofs, and there is usually no single approach that is 
correct, so even if you have a totally different argument, yours may be correct. Very few of 
the following hints give much detail, and some may seem downright cryptic at first. 
Somewhat more detail is presented for the earlier material. 


Section 1.1

1. (a) $\{5, 11, 17\}$

Show that if $A \subseteq B$, then $A = A \cap B$. Next show that if $A = A \cap B$, then $A \subseteq B$.

Show that if $x \in A \setminus (B \cap C)$, then $x \in (A \setminus B) \cup (A \setminus C)$. Next show that if $y \in (A \setminus B) \cup (A \setminus C)$,
then $y \in A \setminus (B \cap C)$. Since the sets $A \setminus (B \cap C)$ and $(A \setminus B) \cup (A \setminus C)$ contain the same elements, they
are equal.

(a) $A_1 \cap A_2 = \{6, 12, 18, 24, \ldots\} = \{6k : k \in \mathbb{N}\} = A_5$.
(b) $\bigcup A_n = \mathbb{N} \setminus \{1\}$ and $\bigcap A_n = \emptyset$.

No. For example, both $(0, 1)$ and $(0, -1)$ belong to $C$.

11. (a) $f(E) = [2, 3]$, so $h(E) = g(f(E)) = g([2, 3]) = [4, 9]$.
(b) $g^{-1}(G) = [-2, 2]$, so $h^{-1}(G) = [-4, 0]$.

15. If $x \in f^{-1}(G) \cap f^{-1}(H)$, then $x \in f^{-1}(G)$ and $x \in f^{-1}(H)$, so that $f(x) \in G$ and $f(x) \in H$. Then
$f(x) \in G \cap H$, and hence $x \in f^{-1}(G \cap H)$. This shows that $f^{-1}(G) \cap f^{-1}(H) \subseteq f^{-1}(G \cap H)$.

17. One possibility is $f(x) := (x - a)/(b - a)$.

21. If $g(f(x_1)) = g(f(x_2))$, then $f(x_1) = f(x_2)$, so that $x_1 = x_2$, which implies that $g \circ f$ is
injective. If $w \in C$, there exists $y \in B$ such that $g(y) = w$, and there exists $x \in A$ such that
$f(x) = y$. Then $g(f(x)) = w$, so that $g \circ f$ is surjective. Thus $g \circ f$ is a bijection.

22. (a) If $f(x_1) = f(x_2)$, then $g(f(x_1)) = g(f(x_2))$, which implies $x_1 = x_2$, since $g \circ f$ is injective.
Thus $f$ is injective.

Section 1.2

1. Note that $1/(1 \cdot 2) = 1/(1 + 1)$. Also $k/(k+1) + 1/[(k+1)(k+2)] = (k+1)/(k+2)$.

2. $[\tfrac12 k(k+1)]^2 + (k+1)^3 = [\tfrac12 (k+1)(k+2)]^2$.

4. $\tfrac13 (4k^3 - k) + (2k+1)^2 = \tfrac13 [4(k+1)^3 - (k+1)]$.

6. $(k+1)^3 + 5(k+1) = (k^3 + 5k) + 3k(k+1) + 6$ and $k(k+1)$ is always even.

8. $5^{k+1} - 4(k+1) - 1 = 5 \cdot 5^k - 4k - 5 = (5^k - 4k - 1) + 4(5^k - 1)$.

13. If $k < 2^k$, then $k + 1 < 2^k + 1 < 2^k + 2^k = 2(2^k) = 2^{k+1}$.

16. It is true for $n = 1$ and $n \ge 5$, but false for $n = 2, 3, 4$.

18. $\sqrt{k} + 1/\sqrt{k+1} = (\sqrt{k}\sqrt{k+1} + 1)/\sqrt{k+1} > (k+1)/\sqrt{k+1} = \sqrt{k+1}$.

Section 1.3


1. Use Exercise 1.1.21 (= Exercise 21 of Section 1.1).

2. Part (b) Let $f$ be a bijection of $\mathbb{N}_m$ onto $A$ and let $C = \{f(k)\}$ for some $k \in \mathbb{N}_m$. Define $g$ on
$\mathbb{N}_{m-1}$ by $g(i) := f(i)$ for $i = 1, \ldots, k-1$, and $g(i) := f(i+1)$ for $i = k, \ldots, m-1$. Then $g$ is a
bijection of $\mathbb{N}_{m-1}$ onto $A \setminus C$.

3. (a) There are $6 = 3 \cdot 2 \cdot 1$ different injections of $S$ into $T$.
(b) There are 3 surjections that map $a$ into 1, and there are 3 other surjections that map $a$
into 2.

7. If $T_1$ is denumerable, take $T_2 = \mathbb{N}$. If $f$ is a bijection of $T_1$ onto $T_2$, and if $g$ is a bijection of $T_2$
onto $\mathbb{N}$, then (by Exercise 1.1.21) $g \circ f$ is a bijection of $T_1$ onto $\mathbb{N}$, so that $T_1$ is denumerable.

9. If $S \cap T = \emptyset$ and $f : \mathbb{N} \to S$, $g : \mathbb{N} \to T$ are bijections onto $S$ and $T$, respectively, let $h(n) :=
f((n+1)/2)$ if $n$ is odd and $h(n) := g(n/2)$ if $n$ is even.

11. (a) $\mathcal{P}(\{1, 2\}) = \{\emptyset, \{1\}, \{2\}, \{1, 2\}\}$ has $2^2 = 4$ elements.
(c) $\mathcal{P}(\{1, 2, 3, 4\})$ has $2^4 = 16$ elements.

12. Let $S_{n+1} := \{x_1, \ldots, x_n, x_{n+1}\} = S_n \cup \{x_{n+1}\}$ have $n + 1$ elements. Then a subset of $S_{n+1}$
either (i) contains $x_{n+1}$, or (ii) does not contain $x_{n+1}$. There are a total of $2^n + 2^n = 2 \cdot 2^n = 2^{n+1}$
subsets of $S_{n+1}$.
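The counting argument in hints 11 and 12 can be checked by brute force. The sketch below enumerates all subsets of a small set; the helper name `power_set` is an illustrative choice, and the split on the last element mirrors cases (i) and (ii) of hint 12.

```python
from itertools import combinations


def power_set(items):
    """All subsets of `items`, returned as a list of sets."""
    items = list(items)
    return [set(c) for r in range(len(items) + 1) for c in combinations(items, r)]


subsets = power_set([1, 2, 3, 4])
with_last = [s for s in subsets if 4 in s]         # case (i): contains x_{n+1}
without_last = [s for s in subsets if 4 not in s]  # case (ii): does not

print(len(power_set([1, 2])), len(subsets), len(with_last), len(without_last))
```

The two cases have the same size, which is the doubling step $2^n + 2^n = 2^{n+1}$ used in the induction.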

13. For each $m \in \mathbb{N}$, the collection of all subsets of $\mathbb{N}_m$ is finite. Note that $\mathcal{F}(\mathbb{N}) = \bigcup_{m=1}^{\infty} \mathcal{P}(\mathbb{N}_m)$.

Section 2.1

1. (a) Justify the steps in: $b = 0 + b = (-a + a) + b = -a + (a + b) = -a + 0 = -a$.
(c) Apply (a) to the equation $a + (-1)a = a(1 + (-1)) = a \cdot 0 = 0$.

2. (a) $-(a + b) = (-1)(a + b) = (-1)a + (-1)b = (-a) + (-b)$.
(c) Note that $(-a)(-(1/a)) = a(1/a) = 1$.

3. (a) $3/2$ (b) $0, 2$ (c) $2, -2$ (d) $1, -2$

Note that if $q \in \mathbb{Z}$ and if $3q^2$ is even, then $q^2$ is even, so that $q$ is even.

If $p \in \mathbb{N}$, then there are three possibilities: for some $m \in \mathbb{N} \cup \{0\}$, (i) $p = 3m$, (ii) $p = 3m + 1$,
or (iii) $p = 3m + 2$.

10. (a) If $c = d$, then 2.1.7(b) implies $a + c < b + d$. If $c < d$, then $a + c < b + c < b + d$.

13. If $a \ne 0$, then 2.1.8(a) implies that $a^2 > 0$; since $b^2 \ge 0$, it follows that $a^2 + b^2 > 0$.

15. (a) If $0 < a < b$, then 2.1.7(c) implies that $0 < a^2 < ab < b^2$. Then by Example 2.1.13(a), we
infer that $a = \sqrt{a^2} < \sqrt{ab} < \sqrt{b^2} = b$.

16. (a) $\{x : x > 4 \text{ or } x < -1\}$. (b) $\{x : 1 < x < 2 \text{ or } -2 < x < -1\}$.
(c) $\{x : -1 < x < 0 \text{ or } x > 1\}$. (d) $\{x : x < 0 \text{ or } x > 1\}$.

19. The inequality is equivalent to $0 \le a^2 - 2ab + b^2 = (a - b)^2$.

20. (a) Use 2.1.7(c).

21. (a) Let $S := \{n \in \mathbb{N} : 0 < n < 1\}$. If $S$ is not empty, the Well-Ordering Property of $\mathbb{N}$ implies
there is a least element $m$ in $S$. However, $0 < m < 1$ implies that $0 < m^2 < m$, and since
$m^2$ is also in $S$, this is a contradiction of the fact that $m$ is the least element of $S$.

22. (a) Let $x := c - 1 > 0$ and apply Bernoulli’s Inequality 2.1.13(c).

24. (a) If $m > n$, then $k := m - n \in \mathbb{N}$, and $c^k \ge c > 1$, which implies that $c^m > c^n$. Conversely,
the hypotheses that $c^m > c^n$ and $m \le n$ lead to a contradiction.

25. Let $b := c^{1/mn}$ and show that $b > 1$. Exercise 24(a) implies that $c^{1/n} = b^m > b^n = c^{1/m}$ if and
only if $m > n$.

26. Fix $m \in \mathbb{N}$ and use Mathematical Induction to prove that $a^{m+n} = a^m a^n$ and $(a^m)^n = a^{mn}$ for all
$n \in \mathbb{N}$. Then, for a given $n \in \mathbb{N}$, prove that the equalities are valid for all $m \in \mathbb{N}$.


Section 2.2

1. (a) If $a \ge 0$, then $|a| = a = \sqrt{a^2}$; if $a < 0$, then $|a| = -a = \sqrt{a^2}$.
(b) It suffices to show that $|1/b| = 1/|b|$ for $b \ne 0$ (why?). Consider the cases $b > 0$ and
$b < 0$.

3. If $x \le y \le z$, then $|x - y| + |y - z| = (y - x) + (z - y) = z - x = |z - x|$. To establish the
converse, show that $y < x$ and $y > z$ are impossible. For example, if $y < x \le z$, it follows from
what we have shown and the given relationship that $|x - y| = 0$, so that $y = x$, a contradiction.

6. (a) $-2 \le x \le 9/2$ (b) $-2 \le x \le 2$.

7. $x = 4$ or $x = -3$.

10. (a) $x < 0$ (b) $-3/2 < x < 1/2$.

12. $\{x : -3 < x < -5/2 \text{ or } 3/2 < x < 2\}$.

13. $\{x : 1 < x < 4\}$.

14. (a) $\{(x, y) : y = \pm x\}$. (c) The hyperbolas $y = 2/x$ and $y = -2/x$.

15. (a) If $y \ge 0$, then $-y \le x \le y$ and we get the region in the upper half-plane on or between the
lines $y = x$ and $y = -x$.

18. (a) Suppose that $a \le b$.

19. If $a \le b \le c$, then $\operatorname{mid}\{a, b, c\} = b = \min\{b, c, c\} = \min\{\max\{a, b\}, \max\{b, c\}, \max\{c, a\}\}$.
The other cases are similar.


Section 2.3

1. Since $0 \le x$ for all $x \in S_1$, then $u = 0$ is a lower bound of $S_1$. If $v > 0$, then $v$ is not a lower
bound of $S_1$ because $v/2 \in S_1$ and $v/2 < v$. Therefore $\inf S_1 = 0$.

Since $1/n \le 1$ for all $n \in \mathbb{N}$, then 1 is an upper bound for $S_3$.

4. $\sup S_4 = 2$ and $\inf S_4 = 1/2$.

Let $u \in S$ be an upper bound of $S$. If $v$ is another upper bound of $S$, then $u \le v$. Hence $u = \sup S$.

10. Let $u := \sup A$, $v := \sup B$, and $w := \sup\{u, v\}$. Then $w$ is an upper bound of $A \cup B$, because if
$x \in A$, then $x \le u \le w$, and if $x \in B$, then $x \le v \le w$. If $z$ is any upper bound of $A \cup B$, then $z$ is
an upper bound of $A$ and of $B$, so that $u \le z$ and $v \le z$. Hence $w \le z$. Therefore,
$w = \sup(A \cup B)$.

12. Consider two cases: $u \ge s^*$ and $u < s^*$.


Section 2.4

1. Since $1 - 1/n < 1$ for all $n \in \mathbb{N}$, 1 is an upper bound. To show that 1 is the supremum, it must be
shown that for each $\varepsilon > 0$ there exists $n \in \mathbb{N}$ such that $1 - 1/n > 1 - \varepsilon$, which is equivalent to
$1/n < \varepsilon$. Apply the Archimedean Property 2.4.3 or 2.4.5.

$\inf S = -1$ and $\sup S = 1$.

4. (a) Let $u := \sup S$ and $a > 0$. Then $x \le u$ for all $x \in S$, whence $ax \le au$ for all $x \in S$, whence
it follows that $au$ is an upper bound of $aS$. If $v$ is another upper bound of $aS$, then $ax \le v$ for
all $x \in S$, whence $x \le v/a$ for all $x \in S$, showing that $v/a$ is an upper bound for $S$ so that
$u \le v/a$, from which we conclude that $au \le v$. Therefore $au = \sup(aS)$.

6. Let $u := \sup f(X)$. Then $f(x) \le u$ for all $x \in X$, so that $a + f(x) \le a + u$ for all $x \in X$, whence
$\sup\{a + f(x) : x \in X\} \le a + u$. If $w < a + u$, then $w - a < u$, so that there exists $x_w \in X$
with $w - a < f(x_w)$, whence $w < a + f(x_w)$, and thus $w$ is not an upper bound for
$\{a + f(x) : x \in X\}$.

8. If $u := \sup f(X)$ and $v := \sup g(X)$, then $f(x) \le u$ and $g(x) \le v$ for all $x \in X$, whence
$f(x) + g(x) \le u + v$ for all $x \in X$.

10. (a) $f(x) = 1$ for $x \in X$. (b) $g(y) = 0$ for $y \in Y$.

12. Let $S := \{h(x, y) : x \in X, y \in Y\}$. We have $h(x, y) \le F(x)$ for all $x \in X$, $y \in Y$ so that
$\sup S \le \sup\{F(x) : x \in X\}$. If $w < \sup\{F(x) : x \in X\}$, then there exists $x_0 \in X$ with
$w < F(x_0) = \sup\{h(x_0, y) : y \in Y\}$, whence there exists $y_0 \in Y$ with $w < h(x_0, y_0)$. Thus $w$ is not an
upper bound of $S$, and so $w < \sup S$. Since this is true for any $w$ such that
$w < \sup\{F(x) : x \in X\}$, we conclude that $\sup\{F(x) : x \in X\} \le \sup S$.

14. Note that $n < 2^n$ (whence $1/2^n < 1/n$) for any $n \in \mathbb{N}$.

15. Let $S_3 := \{s \in \mathbb{R} : 0 \le s, s^2 < 3\}$. Show that $S_3$ is nonempty and bounded by 3 and let
$y := \sup S_3$. If $y^2 < 3$ and $1/n < (3 - y^2)/(2y + 1)$, show that $y + 1/n \in S_3$. If $y^2 > 3$ and
$1/m < (y^2 - 3)/2y$, show that $y - 1/m$ is an upper bound of $S_3$. Both conclusions contradict
$y = \sup S_3$. Therefore $y^2 = 3$.

18. If $x < 0 < y$, then we can take $r = 0$. If $x < y < 0$, we apply 2.4.8 to obtain a rational number
between $-y$ and $-x$.


Section 2.5

2. $S$ has an upper bound $b$ and a lower bound $a$ if and only if $S$ is contained in the interval $[a, b]$.

4. Because $z$ is neither a lower bound nor an upper bound of $S$.

5. If $z \in \mathbb{R}$, then $z$ is not a lower bound of $S$ so there exists $x_z \in S$ such that $x_z < z$. Similarly, there
exists $y_z \in S$ such that $z < y_z$.

8. If $x > 0$, then there exists $n \in \mathbb{N}$ with $1/n < x$, so that $x \notin J_n$. If $y \le 0$, then $y \notin J_1$.

10. Let $\eta := \inf\{b_n : n \in \mathbb{N}\}$; we claim that $a_n \le \eta$ for all $n$. Fix $n \in \mathbb{N}$; we will show that $a_n$ is a
lower bound for the set $\{b_k : k \in \mathbb{N}\}$. We consider two cases. (i) If $n \le k$, then since $I_n \supseteq I_k$, we
have $a_n \le a_k \le b_k$. (ii) If $k < n$, then since $I_k \supseteq I_n$, we have $a_n \le b_n \le b_k$. Therefore $a_n \le b_k$
for all $k \in \mathbb{N}$, so that $a_n$ is a lower bound for $\{b_k : k \in \mathbb{N}\}$ and so $a_n \le \eta$. In particular, this
shows that $\eta \in [a_n, b_n]$ for all $n$, so that $\eta \in \bigcap I_n$.

12. $\tfrac38 = (.011000\ldots)_2 = (.010111\ldots)_2$. $\tfrac{7}{16} = (.0111000\ldots)_2 = (.0110111\ldots)_2$.

13. (a) $\tfrac13 \approx (.0101)_2$. (b) $\tfrac13 = (.010101\ldots)_2$, the block 01 repeats.

16. $\tfrac17 = .142857\ldots$, the block repeats. $\tfrac{2}{19} = .105\,263\,157\,894\,736\,842\ldots$, the block repeats.

17. $1.25\,137\ldots137\ldots = 31253/24975$, $35.14\,653\ldots653\ldots = 3511139/99900$.
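The fractions in hints 16 and 17 can be confirmed with exact arithmetic. The sketch below converts a terminating prefix followed by a repeating block into a fraction; the helper name `repeating_decimal` and its string-based interface are illustrative assumptions.

```python
from fractions import Fraction


def repeating_decimal(prefix, block):
    """Exact value of the decimal `prefix` followed by `block` repeated forever.

    `prefix` is a decimal string such as "1.25"; `block` is a digit string.
    """
    head = Fraction(prefix)
    decimals = len(prefix.split(".")[1]) if "." in prefix else 0
    # A block of length L repeated forever equals block / (10**L - 1),
    # shifted right by the number of digits already used in the prefix.
    tail = Fraction(int(block), (10 ** len(block) - 1) * 10 ** decimals)
    return head + tail


print(repeating_decimal("1.25", "137"))
print(repeating_decimal("35.14", "653"))
print(repeating_decimal("0", "142857"))
print(repeating_decimal("0", "105263157894736842"))
```

The same function handles purely periodic expansions (empty fractional prefix) and mixed ones.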

Section 3.1

1. (a) $0, 2, 0, 2, 0$ (c) $1/2, 1/6, 1/12, 1/20, 1/30$

3. (a) $1, 4, 13, 40, 121$ (c) $1, 2, 3, 5, 4$.

5. (a) We have $0 < n/(n^2 + 1) < n/n^2 = 1/n$. Given $\varepsilon > 0$, let $K(\varepsilon) > 1/\varepsilon$.
(c) We have $|(3n + 1)/(2n + 5) - 3/2| = 13/(4n + 10) < 13/4n$. Given $\varepsilon > 0$, let
$K(\varepsilon) > 13/4\varepsilon$.

6. (a) $1/\sqrt{n + 7} < 1/\sqrt{n}$ (b) $|2n/(n + 2) - 2| = 4/(n + 2) < 4/n$
(c) $\sqrt{n}/(n + 1) < 1/\sqrt{n}$ (d) $|(-1)^n n/(n^2 + 1)| \le 1/n$.

9. $0 < \sqrt{x_n} < \varepsilon \iff 0 < x_n < \varepsilon^2$.

11. $|1/n - 1/(n + 1)| = 1/n(n + 1) < 1/n^2 \le 1/n$.

14. Let $b := 1/(1 + a)$ where $a > 0$. Since $(1 + a)^n > \tfrac12 n(n - 1)a^2$, we have that
$0 < nb^n < n/[\tfrac12 n(n - 1)a^2] \le 2/[(n - 1)a^2]$. Thus $\lim(nb^n) = 0$.

16. If $n > 4$, then $0 < n^2/n! < n/(n - 2)(n - 1) < 1/(n - 3)$.


Section 3.2

1. (a) $\lim(x_n) = 1$ (c) $x_n \ge n/2$, so the sequence diverges.

3. $Y = (X + Y) - X$.

6. (a) 4 (b) 0 (c) 1 (d) 0.

8. In (3) the exponent $k$ is fixed, but in $(1 + 1/n)^n$ the exponent varies.

9. $\lim(y_n) = 0$ and $\lim(\sqrt{n}\, y_n) = \tfrac12$.

12. $b$.

14. (a) 1 (b) 1

16. (a) $L = a$ (b) $L = b/2$ (c) $L = 1/b$ (d) $L = 8/9$.

19. (a) Converges to 0 (c) Converges to 0.

21. (a) $(1)$ (b) $(n)$.

22. Yes. (Why?)

23. From Exercise 2.2.18, $u_n = \tfrac12 (x_n + y_n + |x_n - y_n|)$.

24. Use Exercises 2.2.18(b), 2.2.19, and the preceding exercise.


Section 3.3

1. $(x_n)$ is a bounded decreasing sequence. The limit is 4.

2. The limit is 1.

3. The limit is 2.

4. The limit is 2.

5. $(y_n)$ is increasing. The limit is $y = \tfrac12 (1 + \sqrt{1 + 4p})$.

7. $(x_n)$ is increasing.

10. Note $y_n = 1/(n+1) + 1/(n+2) + \cdots + 1/2n < 1/(n+1) + 1/(n+1) + \cdots + 1/(n+1)
= n/(n+1) < 1$.

12. (a) $e$ (b) $e^2$ (c) $e$ (d) $1/e$.

13. Note that if $n \ge 2$, then $0 \le s_n - \sqrt{2} \le s_n^2 - 2$.

14. Note that $0 \le s_n - \sqrt{5} \le (s_n^2 - 5)/\sqrt{5} \le (s_n^2 - 5)/2$.

15. $e_2 = 2.25$, $e_4 = 2.441\,406$, $e_8 = 2.565\,785$, $e_{16} = 2.637\,928$.

16. $e_{50} = 2.691\,588$, $e_{100} = 2.704\,814$, $e_{1000} = 2.716\,924$.

Section 3.4

1. For example $x_{2n-1} := 2n - 1$ and $x_{2n} := 1/2n$.

3. $L = \tfrac12 (1 + \sqrt{5})$.

7. (a) $e$ (b) $e^{1/2}$ (c) $e^2$ (d) $e^2$.

8. (a) 1 (b) $e^{3/2}$.

12. Choose $n_1 \ge 1$ so that $|x_{n_1}| > 1$, then choose $n_2 > n_1$ so that $|x_{n_2}| > 2$, and, in general, choose
$n_k > n_{k-1}$ so that $|x_{n_k}| > k$.

13. $(x_{2n-1}) = (-1, -1/3, -1/5, \ldots)$.

14. Choose $n_1 \ge 1$ so that $x_{n_1} \ge s - 1$, then choose $n_2 > n_1$ so that $x_{n_2} > s - 1/2$, and, in general,
choose $n_k > n_{k-1}$ so that $x_{n_k} > s - 1/k$.

Section 3.5

1. For example, $((-1)^n)$.

3. (a) Note that $|(-1)^n - (-1)^{n+1}| = 2$ for all $n \in \mathbb{N}$.
(c) Take $m = 2n$, so $x_m - x_n = x_{2n} - x_n = \ln 2n - \ln n = \ln 2$ for all $n$.

5. $\lim(\sqrt{n+1} - \sqrt{n}) = 0$. But, if $m = 4n$, then $\sqrt{4n} - \sqrt{n} = \sqrt{n}$ for all $n$.

8. Let $u := \sup\{x_n : n \in \mathbb{N}\}$. If $\varepsilon > 0$, let $H$ be such that $u - \varepsilon < x_H \le u$. If $m \ge n \ge H$, then
$u - \varepsilon < x_n \le x_m \le u$ so that $|x_m - x_n| < \varepsilon$.

10. $\lim(x_n) = (1/3)x_1 + (2/3)x_2$.

12. The limit is $\sqrt{2} - 1$.

13. The limit is $1 + \sqrt{2}$.

14. Four iterations give $r = 0.201\,64$ to 5 places.

Section 3.6

1. If $\{x_n : n \in \mathbb{N}\}$ is not bounded above, choose $n_{k+1} > n_k$ such that $x_{n_k} \ge k$ for $k \in \mathbb{N}$.

3. Note that $|x_n - 0| < \varepsilon$ if and only if $1/x_n > 1/\varepsilon$.

4. (a) $[\sqrt{n} > a] \iff [n > a^2]$ (c) $\sqrt{n - 1} \ge \sqrt{n/2}$ when $n \ge 2$.

8. (a) $n < (n^2 + 2)^{1/2}$.
(c) Since $n < (n^2 + 1)^{1/2}$, then $n^{1/2} < (n^2 + 1)^{1/2}/n^{1/2}$.

9. (a) Since $x_n/y_n \to \infty$, there exists $K_1$ such that if $n \ge K_1$, then $x_n \ge y_n$. Now apply Theorem
3.6.4(a).

Section 3.7

1. The partial sums of $\sum b_n$ are a subsequence of the partial sums of $\sum a_n$.

3. (a) Since $1/(n+1)(n+2) = 1/(n+1) - 1/(n+2)$, the series is telescoping.

6. (a) $4/35$.

9. (a) The sequence $(\cos n)$ does not converge to 0.
(b) Since $|(\cos n)/n^2| \le 1/n^2$, the convergence of $\sum (\cos n)/n^2$ follows from Example 3.7.6(c)
and Theorem 3.7.7.

10. The “even” sequence $(s_{2n})$ is decreasing, the “odd” sequence $(s_{2n+1})$ is increasing, and
$-1 \le s_n \le 0$. Also $0 \le s_{2n} - s_{2n+1} = 1/\sqrt{2n + 1}$.

12. $\sum 1/n^2$ is convergent, but $\sum 1/n$ is not.

14. Show that $b_k \ge a_1/k$ for $k \in \mathbb{N}$, whence $b_1 + \cdots + b_n \ge a_1(1 + \cdots + 1/n)$.

15. Evidently $2a(4) \le a(3) + a(4)$ and $2^2 a(8) \le a(5) + \cdots + a(8)$, etc. Also $a(2) + a(3) \le 2a(2)$
and $a(4) + \cdots + a(7) \le 2^2 a(2^2)$, etc. The stated inequality follows by addition. Now apply the
Comparison Test 3.7.7.

17. (a) The terms are decreasing and $2^n/(2^n \ln(2^n)) = 1/(n \ln 2)$. Since $\sum 1/n$ diverges, so does
$\sum 1/(n \ln n)$.

18. (a) The terms are decreasing and $2^n/(2^n (\ln 2^n)^c) = (1/n^c) \cdot (1/\ln 2)^c$. Now use the fact that
$\sum (1/n^c)$ converges when $c > 1$.

Section 4.1


1. (a-c) If $|x - 1| \le 1$, then $|x + 1| \le 3$ so that $|x^2 - 1| \le 3|x - 1|$. Thus, $|x - 1| < 1/6$
assures that $|x^2 - 1| < 1/2$, etc.
(d) If $|x - 1| < 1$, then $|x^3 - 1| < 7|x - 1|$.

2. (a) Since $|\sqrt{x} - 2| = |x - 4|/(\sqrt{x} + 2) \le \tfrac12 |x - 4|$, then $|x - 4| < 1$ implies that we have
$|\sqrt{x} - 2| < \tfrac12$.
(b) If $|x - 4| < 2 \times 10^{-2} = .02$, then $|\sqrt{x} - 2| < .01$.

5. If $0 < x < a$, then $0 < x + c < a + c < 2a$, so that $|x^2 - c^2| = |x + c||x - c| \le 2a|x - c|$.
Given $\varepsilon > 0$, take $\delta := \varepsilon/2a$.

8. If $c \ne 0$, show that $|\sqrt{x} - \sqrt{c}| \le (1/\sqrt{c})|x - c|$, so we can take $\delta := \varepsilon\sqrt{c}$. If $c = 0$, we can take
$\delta := \varepsilon^2$.

9. (a) If $|x - 2| < 1/2$ show that $|1/(1 - x) + 1| = |(x - 2)/(x - 1)| \le 2|x - 2|$. Thus we can
take $\delta := \inf\{1/2, \varepsilon/2\}$.
(c) If $x \ne 0$, then $|x^2/|x| - 0| = |x|$. Take $\delta := \varepsilon$.

10. (a) If $|x - 2| < 1$, then $|x^2 + 4x - 12| = |x + 6||x - 2| < 9|x - 2|$. We may take
$\delta := \inf\{1, \varepsilon/9\}$.
(b) If $|x + 1| < 1/4$, then $|(x + 5)/(2x + 3) - 4| = 7|x + 1|/|2x + 3| < 14|x + 1|$, and we
may take $\delta := \inf\{1/4, \varepsilon/14\}$.

12. (a) Let $x_n := 1/n$. (c) Let $x_n := 1/n$ and $y_n := -1/n$.

14. (b) If $f(x) := \operatorname{sgn}(x)$, then $\lim_{x \to 0} (f(x))^2 = 1$, but $\lim_{x \to 0} f(x)$ does not exist.

15. (a) Since $|f(x) - 0| \le |x|$, we have $\lim_{x \to 0} f(x) = 0$.
(b) If $c \ne 0$ is rational, let $(x_n)$ be a sequence of irrational numbers that converges to $c$; then
$f(c) = c \ne 0 = \lim(f(x_n))$. What if $c$ is irrational?

17. The restriction of sgn to $[0, 1]$ has a limit at 0.

Section 4.2

1. (a) 10 (b) $-3$ (c) $1/12$ (d) $1/2$.

2. (a) 1 (b) 4 (c) 2 (d) $1/2$.

3. Multiply the numerator and denominator by $\sqrt{1 + 2x} + \sqrt{1 + 3x}$.

4. Consider $x_n := 1/2\pi n$ and $\cos(1/x_n) = 1$. Use the Squeeze Theorem 4.2.7.

8. If $|x| \le 1$, $k \in \mathbb{N}$, then $|x^k| = |x|^k \le 1$, whence $-x^2 \le x^{k+2} \le x^2$.

11. (a) No limit (b) 0 (c) No limit (d) 0.

Section 4.3

2. Let $f(x) := \sin(1/x)$ for $x < 0$ and $f(x) := 0$ for $x > 0$.

3. Given $\alpha > 0$, if $0 < x < 1/\alpha^2$, then $\sqrt{x} < 1/\alpha$, and so $f(x) > \alpha$.

5. (a) If $\alpha > 1$ and $1 < x < \alpha/(\alpha - 1)$, then $\alpha < x/(x - 1)$, hence we have
$\lim_{x \to 1+} x/(x - 1) = \infty$.
(c) Since $(x + 2)/\sqrt{x} > 2/\sqrt{x}$, the limit is $\infty$.
(e) If $x > 0$, then $1/\sqrt{x} < (\sqrt{x} + 1)/x$, so the right-hand limit is $\infty$.
(g) 1 (h) $-1$.

8. Note that $|f(x) - L| < \varepsilon$ for $x > K$ if and only if $|f(1/z) - L| < \varepsilon$ for $0 < z < 1/K$.

9. There exists $\alpha > 0$ such that $|xf(x) - L| < 1$ whenever $x > \alpha$. Hence $|f(x)| < (|L| + 1)/x$
for $x > \alpha$.

12. No. If $h(x) := f(x) - g(x)$, then $\lim_{x \to \infty} h(x) = 0$ and we have
$f(x)/g(x) = 1 + h(x)/g(x) \to 1$.

13. Suppose that $|f(x) - L| < \varepsilon$ for $x > K$, and that $g(y) > K$ for $y > H$. Then
$|f \circ g(y) - L| < \varepsilon$ for $y > H$.

Section 5.1

4. (a) Continuous if $x \ne 0, \pm 1, \pm 2, \ldots$ (b) Continuous if $x \ne \pm 1, \pm 2, \ldots$
(c) Continuous if $\sin x \ne 0, 1$ (d) Continuous if $x \ne 0, \pm 1, \pm 1/2, \ldots$.

7. Let $\varepsilon := f(c)/2$, and let $\delta > 0$ be such that if $|x - c| < \delta$, then $|f(x) - f(c)| < \varepsilon$, which implies
that $f(x) > f(c) - \varepsilon = f(c)/2 > 0$.

8. Since $f$ is continuous at $x$, we have $f(x) = \lim(f(x_n)) = 0$. Thus $x \in S$.

10. Note that $\bigl||x| - |c|\bigr| \le |x - c|$.

13. Since $|g(x) - 6| \le \sup\{|2x - 6|, |x - 3|\} = 2|x - 3|$, $g$ is continuous at $x = 3$. If $c \ne 3$, let
$(x_n)$ be a sequence of rational numbers converging to $c$ and let $(y_n)$ be a sequence of irrational
numbers converging to $c$. Then $\lim(g(x_n)) \ne \lim(g(y_n))$.

Section 5.2

1. (a) Continuous on $\mathbb{R}$ (c) Continuous for $x \ne 0$.

2. Use 5.2.1(a) and Induction, or use 5.2.7 with $g(x) := x^n$.

4. Continuous at every noninteger.

7. Let $f(x) := 1$ if $x$ is rational, and $f(x) := -1$ if $x$ is irrational.

12. First show that $f(0) = 0$ and $f(-x) = -f(x)$ for all $x \in \mathbb{R}$; then note that $f(x - x_0) =
f(x) - f(x_0)$. Consequently $f$ is continuous at the point $x_0$ if and only if it is continuous at
0. Thus, if $f$ is continuous at $x_0$, then it is continuous at 0, and hence everywhere.

13. First show that $f(0) = 0$ and (by Induction) that $f(x) = cx$ for $x \in \mathbb{N}$, and hence also for $x \in \mathbb{Z}$.
Next show that $f(x) = cx$ for $x \in \mathbb{Q}$. Finally, if $x \notin \mathbb{Q}$, let $x = \lim(r_n)$ for some sequence in $\mathbb{Q}$.

15. If $f(x) \ge g(x)$, then both expressions give $h(x) = f(x)$; and if $f(x) \le g(x)$, then $h(x) = g(x)$ in
both cases.

Section 5.3 

1. Apply either the Boundedness Theorem 5.3.2 to 1/f, or the Maximum-Minimum Theorem 5.3.4
to conclude that inf f(I) > 0.

3. Choose a sequence (x_n) such that |f(x_{n+1})| ≤ (1/2)|f(x_n)| ≤ (1/2)^n |f(x_1)|. Apply the Bolzano-
Weierstrass Theorem to obtain a convergent subsequence.

4. Suppose that p has odd degree n and that the coefficient a_n of x^n is positive. By 4.3.16,

lim_{x→∞} p(x) = ∞ and lim_{x→−∞} p(x) = −∞.

5. In the intervals [1.035, 1.040] and [—7.026, —7.025]. 
7. In the interval [0.7390, 0.7391]. 
8. In the interval [1.4687, 1.4765]. 
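
A bracketing interval such as the one in hint 7 can be reproduced with a few lines of bisection. The sketch below assumes the equation being solved is x = cos x on [0, π/2]; that choice is an assumption made for illustration, and the function name `bisect` is hypothetical.

```python
import math

def bisect(f, a, b, tol):
    """Shrink [a, b] by bisection until its length is at most tol.

    Requires f(a) and f(b) to have opposite signs (or one of them to be 0).
    """
    if f(a) * f(b) > 0:
        raise ValueError("f must change sign on [a, b]")
    while b - a > tol:
        m = (a + b) / 2
        # Keep the half on which the sign change persists.
        if f(a) * f(m) <= 0:
            b = m
        else:
            a = m
    return a, b

# Assumed test equation: x - cos x = 0 on [0, pi/2].
lo, hi = bisect(lambda x: x - math.cos(x), 0.0, math.pi / 2, 1e-6)
print(lo, hi)
```

The returned pair brackets the root, so any interval quoted to four decimals can be checked against it.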

9. (a) 1 (b) 6. 

10. 1/2^n < 10^−5 implies that n > (5 ln 10)/ln 2 ≈ 16.61. Take n = 17.
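
The count in hint 10 is easy to confirm numerically. This is a minimal sketch that assumes, as the hint does, a starting interval of length 1, so that the error after n bisections is at most 1/2^n; the helper name `bisection_steps` is illustrative.

```python
import math

def bisection_steps(length, tol):
    """Smallest n with length / 2**n < tol."""
    n = max(0, math.ceil(math.log(length / tol) / math.log(2)))
    # Guard against the boundary case where length / 2**n equals tol.
    while length / 2**n >= tol:
        n += 1
    return n

threshold = 5 * math.log(10) / math.log(2)   # the quantity (5 ln 10)/ln 2
steps = bisection_steps(1.0, 1e-5)
print(threshold, steps)
```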

11. If f(w) < 0, then it follows from Theorem 4.2.9 that there exists a δ-neighborhood V_δ(w) such
that f(x) < 0 for all x ∈ V_δ(w).
14. Apply Theorem 4.2.9 to β − f(x).

15. If 0 < a < b ≤ ∞, then f((a, b)) = (a², b²); if −∞ ≤ a < b < 0, then f((a, b)) = (b², a²).
If a < 0 < b, then f((a, b)) is not an open interval, but equals [0, c) where c := sup{a², b²}.
Images of closed intervals are treated similarly.

16. For example, if a < 0 < b and c := inf{1/(a² + 1), 1/(b² + 1)}, then g((a, b)) = (c, 1]. If
0 < a < b, then g((a, b)) = (1/(b² + 1), 1/(a² + 1)). Also g([−1, 1]) = [1/2, 1]. If a < b,
then h((a, b)) = (a³, b³) and h((a, b]) = (a³, b³].

17. Yes. Use the Density Theorem 2.4.8. 

19. Consider g(x) := 1/x for x € J := (0, 1). 


Section 5.4 
1. Since 1/x − 1/u = (u − x)/xu, it follows that |1/x − 1/u| ≤ (1/a²)|x − u| for x, u ∈ [a, ∞).
3. (a) Let x_n := n + 1/n, u_n := n.
(b) Let x_n := 1/(2nπ), u_n := 1/(2nπ + π/2).

6. If M is a bound for both f and g on A, show that |f(x)g(x) − f(u)g(u)| ≤ M|f(x) − f(u)| +
M|g(x) − g(u)| for all x, u ∈ A.

8. Given ε > 0 there exists δ_f > 0 such that |y − v| < δ_f implies |f(y) − f(v)| < ε. Now choose
δ_g > 0 so that |x − u| < δ_g implies |g(x) − g(u)| < δ_f.
11. If |g(x) − g(0)| ≤ K|x − 0| for all x ∈ [0, 1], then √x ≤ Kx for x ∈ [0, 1]. But if x_n := 1/n²,
then K must satisfy n ≤ K for all n ∈ N, which is impossible.
14. Since f is bounded on [0, p], it follows that it is bounded on R. Since f is continuous on
J := [−1, p + 1], it is uniformly continuous on J. Now show that this implies that f is uniformly
continuous on R.


Section 5.5 
1. (a) The δ-intervals are [−1/4, 1/4], [1/4, 3/4], and [3/8, 9/8].
(b) The third δ-interval does not contain [1/2, 1].
2. (a) Yes. (b) Yes.
3. No. The first δ_2-interval is [−1/10, 1/10] and does not contain [0, 1/4].
4. (b) If t ∈ (1/2, 1) then [t − δ(t), t + δ(t)] = [−1/2 + 3t/2, 1/2 + t/2] ⊂ (1/4, 1).
6. We could have two subintervals having c as a tag with one of them not contained in the
δ-interval around c.

7. If Ṗ := {([a, x_1], t_1), ..., ([x_{k−1}, c], t_k), ([c, x_{k+1}], t_{k+1}), ..., ([x_n, b], t_n)} is δ*-fine, then
Ṗ' := {([a, x_1], t_1), ..., ([x_{k−1}, c], t_k)} is a δ'-fine partition of [a, c] and
Ṗ'' := {([c, x_{k+1}], t_{k+1}), ..., ([x_n, b], t_n)} is a δ''-fine partition of [c, b].

9. The hypothesis that f is locally bounded presents us with a gauge δ. If {([x_{i−1}, x_i], t_i)}_{i=1}^n is a
δ-fine partition of [a, b] and M_i is a bound for |f| on [x_{i−1}, x_i], let M := sup{M_i : i = 1, ..., n}.


Section 5.6 
1. If x ∈ [a, b], then f(a) ≤ f(x).
4. If 0 ≤ f(x_1) ≤ f(x_2) and 0 ≤ g(x_1) ≤ g(x_2), then f(x_1)g(x_1) ≤ f(x_2)g(x_1) ≤ f(x_2)g(x_2).

6. If f is continuous at c, then lim(f(x_n)) = f(c), since c = lim(x_n). Conversely, since
0 ≤ j_f(c) ≤ f(x_{2n}) − f(x_{2n+1}), it follows that j_f(c) = 0, so f is continuous at c.
7. Apply Exercises 2.4.4, 2.4.5, and the Principle of the Iterated Infima (analogous to the result in
Exercise 2.4.12).

8. Let x_1 ∈ I be such that y = f(x_1) and x_2 ∈ I be such that y = g(x_2). If x_2 ≤ x_1, then
y = g(x_2) < f(x_2) ≤ f(x_1) = y, a contradiction.

11. Note that f^−1 is continuous at every point of its domain [0, 1] ∪ (2, 3].

14. Let y := x^{1/n} and z := x^{1/q} so that y^n = x = z^q, whence (by Exercise 2.1.26) y^{np} = x^p = z^{qp}.
Since np = mq, show that (x^{1/n})^m = (x^{1/q})^p, or x^{m/n} = x^{p/q}. Now consider the case where
m, p ∈ Z.

15. Use the preceding exercise and Exercise 2.1.26. 

Section 6.1 

1. (a) f'(x) = lim_{h→0} [(x + h)³ − x³]/h = lim_{h→0} (3x² + 3xh + h²) = 3x²,
(c) h'(x) = lim_{h→0} (√(x + h) − √x)/h = lim_{h→0} 1/(√(x + h) + √x) = 1/(2√x).
4. Note that |f(x)/x| ≤ |x| for x ∈ R.
5. (a) f'(x) = (1 − x²)/(1 + x²)² (b) g'(x) = (x − 1)/√(5 − 2x + x²)
(c) h'(x) = mkx^{k−1}(cos x^k)(sin x^k)^{m−1} (d) k'(x) = 2x sec²(x²).
6. The function f' is continuous for n ≥ 2 and is differentiable for n ≥ 3.
8. (a) f'(x) = 2 for x > 0, f'(x) = 0 for −1 < x < 0, and f'(x) = −2 for x < −1,
(c) h'(x) = 2|x| for all x ∈ R.
10. If x ≠ 0, then g'(x) = 2x sin(1/x²) − (2/x) cos(1/x²). Moreover,
g'(0) = lim_{h→0} h sin(1/h²) = 0. Consider x_n := 1/√(2nπ).

11. (a) f'(x) = 2/(2x + 3) (b) g'(x) = 6(L(x²))²/x
(c) h'(x) = 1/x (d) k'(x) = 1/(xL(x))

14. 1/h'(0) = 1/2, 1/h'(1) = 1/5, and 1/h'(−1) = 1/5.

16. D[Arctan y] = 1/D[tan x] = 1/sec²x = 1/(1 + y²).

Section 6.2 

1. (a) Increasing on [3/2, ∞), decreasing on (−∞, 3/2],
(c) Increasing on (−∞, −1] and [1, ∞).
2. (a) Relative minimum at x = 1; relative maximum at x = −1,
(c) Relative maximum at x = 2/3.
3. (a) Relative minima at x = ±1; relative maxima at x = 0, ±4,
(c) Relative minima at x = −2, 3; relative maximum at x = 2.
6. If x < y there exists c in (x, y) such that |sin x − sin y| = |cos c||y − x|.

9. f(x) = x⁴(2 + sin(1/x)) > 0 for x ≠ 0, so f has an absolute minimum at x = 0. Show that
f'(1/2nπ) < 0 for n ≥ 2 and f'(2/(4n + 1)π) > 0 for n ≥ 1.

10. g'(0) = lim_{x→0} (1 + 2x sin(1/x)) = 1 + 0 = 1, and if x ≠ 0, then g'(x) = 1 + 4x sin(1/x) −
2 cos(1/x). Now show that g'(1/2nπ) < 0 and that we have g'(2/(4n + 1)π) > 0 for n ∈ N.

14. Apply Darboux's Theorem 6.2.12.

17. Apply the Mean Value Theorem to the function g − f on [0, x].

20. (a, b) Apply the Mean Value Theorem.
(c) Apply Darboux's Theorem to the results of (a) and (b).

Section 6.3

1. A = B(lim f(x)/g(x)) = 0.

4. Note that f'(0) = 0, but that f'(x) does not exist if x ≠ 0.

Iaa (b) 1 © 0 (d) 1/3. 
8. (a) 1 (b) ∞ (c) 0 (d) 0.
9. (a) 0 (b) 0 (c) 0 (d) 0. 
10. (a) 1 b) 1 O 2 (d) o. 
1. @ 1 (b) 1 (er 4 (d) 0. 
Section 6.4 


1. f^(2n−1)(x) = (−1)^n a^(2n−1) sin ax and f^(2n)(x) = (−1)^n a^(2n) cos ax for n ∈ N.

4. Apply Taylor's Theorem to f(x) := √(1 + x) at x_0 := 0 and note that R_1(x) < 0 and R_2(x) > 0
for x > 0.

5. 1.095 < √1.2 < 1.1 and 1.375 < √2 < 1.5.
6. R_2(0.2) < 0.0005 and R_2(1) < 0.0625.
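
The bounds in hints 5 and 6 come from the first two Taylor polynomials of √(1 + x) at 0, namely 1 + x/2 and 1 + x/2 − x²/8. The sketch below simply evaluates them; the names `p1` and `p2` are illustrative, and the remainder is measured as the exact value minus `p2`.

```python
import math

def p1(x):
    """First-degree Taylor polynomial of sqrt(1 + x) at 0."""
    return 1 + x / 2

def p2(x):
    """Second-degree Taylor polynomial of sqrt(1 + x) at 0."""
    return 1 + x / 2 - x**2 / 8

for x in (0.2, 1.0):
    exact = math.sqrt(1 + x)
    # R_1 < 0 and R_2 > 0 for x > 0 give p2(x) < sqrt(1 + x) < p1(x).
    print(x, p2(x), exact, p1(x), exact - p2(x))
```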
11. With n = 4, ln 1.5 ≈ 0.40; with n = 7, ln 1.5 ≈ 0.405.
17. Apply Taylor's Theorem to f at x_0 = c to show that f(x) ≥ f(c) + f'(c)(x − c).
19. Since f(2) < 0 and f(2.2) > 0, there is a zero of f in [2.0, 2.2]. The value of x_4 is approximately
2.094 5515.
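
The value quoted in hint 19 is consistent with Newton's Method applied to f(x) = x³ − 2x − 5 starting from x_1 = 2. Both the function and the starting point are assumptions here, inferred from the sign pattern f(2) < 0 < f(2.2) and the quoted digits rather than taken from the exercise statement.

```python
def f(x):
    # Assumed test function; it satisfies f(2) < 0 < f(2.2).
    return x**3 - 2 * x - 5

def df(x):
    return 3 * x**2 - 2

def newton(x1, steps):
    """Return the Newton iterates [x_1, x_2, ..., x_{steps+1}]."""
    xs = [x1]
    for _ in range(steps):
        x = xs[-1]
        xs.append(x - f(x) / df(x))
    return xs

xs = newton(2.0, 3)   # xs[3] is x_4
print(xs)
```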
20. r_1 ≈ 1.452 626 88 and r_2 ≈ −1.164 035 14. 21. r ≈ 1.324 717 96.
22. r_1 ≈ 0.158 594 34 and r_2 ≈ 3.146 193 22. 23. r_1 ≈ 0.5 and r_2 ≈ 0.809 016 99.

24. r ≈ 0.739 085 13.


Section 7.1 
1. (a) ||P_1|| = 2 (b) ||P_2|| = 2 (c) ||P_3|| = 1.4 (d) ||P_4|| = 2.

2. (a) 0²·1 + 1²·1 + 2²·2 = 0 + 1 + 8 = 9
(b) 37 (c) 13 (d) 33.
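
The four values in hint 2 can be reproduced by computing tagged Riemann sums directly. The sketch assumes f(x) = x² on [0, 4] with partitions (0, 1, 2, 4) and (0, 2, 3, 4), tagged at left and then right endpoints; these data are inferred from the arithmetic shown in part (a) and are assumptions, not quotations from the exercise.

```python
def riemann_sum(f, points, tags):
    """Sum of f(t_i) * (x_i - x_{i-1}) over a tagged partition."""
    return sum(f(t) * (b - a) for a, b, t in zip(points, points[1:], tags))

def norm(points):
    """Length of the longest subinterval of the partition."""
    return max(b - a for a, b in zip(points, points[1:]))

def square(x):
    return x**2

P1 = [0, 1, 2, 4]   # assumed partition
P2 = [0, 2, 3, 4]   # assumed partition

sums = [
    riemann_sum(square, P1, P1[:-1]),  # left endpoints as tags
    riemann_sum(square, P1, P1[1:]),   # right endpoints as tags
    riemann_sum(square, P2, P2[:-1]),
    riemann_sum(square, P2, P2[1:]),
]
print(sums, norm(P1), norm(P2))
```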
5. (a) If u ∈ [x_{i−1}, x_i], then x_{i−1} ≤ u so that c_1 ≤ t_i ≤ x_i ≤ x_{i−1} + ||Ṗ|| whence c_1 − ||Ṗ|| ≤
x_{i−1} ≤ u. Also u ≤ x_i so that x_i − ||Ṗ|| ≤ x_{i−1} ≤ t_i ≤ c_2, whence u ≤ x_i ≤ c_2 + ||Ṗ||.

10. g is not bounded. Take rational tags.

12. Let P_n be the partition of [0, 1] into n equal parts. If Ṗ_n is this partition with rational tags, then
S(f; Ṗ_n) = 1, while if Q̇_n is this partition with irrational tags, then S(f; Q̇_n) = 0.

13. If ||Ṗ|| < δ_ε := ε/4α, then the union of the subintervals in Ṗ with tags in [c, d] contains the
interval [c + δ_ε, d − δ_ε] and is contained in [c − δ_ε, d + δ_ε]. Therefore α(d − c − 2δ_ε) ≤
S(φ; Ṗ) ≤ α(d − c + 2δ_ε), whence |S(φ; Ṗ) − α(d − c)| ≤ 2αδ_ε < ε.

14. (b) In fact, (x_i² + x_i x_{i−1} + x_{i−1}²) · (x_i − x_{i−1}) = x_i³ − x_{i−1}³.
(c) The terms in S(Q; Q̇) telescope.

15. Let Ṗ = {([x_{i−1}, x_i], t_i)}_{i=1}^n be a tagged partition of [a, b] and let
Q̇ := {([x_{i−1} + c, x_i + c], t_i + c)}_{i=1}^n, so that Q̇ is a tagged partition of [a + c, b + c] and
||Q̇|| = ||Ṗ||. Moreover, S(g; Q̇) = S(f; Ṗ) so that |S(g; Q̇) − ∫_a^b f| = |S(f; Ṗ) − ∫_a^b f| < ε
when ||Q̇|| < δ_ε.

Section 7.2


2. If the tags are all rational, then S(h; Ṗ) ≥ 1, while if the tags are all irrational, then
S(h; Ṗ) = 0.

3. Let Ṗ_n be the partition of [0, 1] into n equal subintervals with t_1 = 1/n and Q̇_n be the same
subintervals tagged by irrational points.
5. If c_1, ..., c_n are the distinct values taken by φ, then φ^−1(c_j) is the union of a finite collection
{J_{j1}, ..., J_{jr_j}} of disjoint subintervals of [a, b]. We can write φ = Σ_{j=1}^n Σ_{k=1}^{r_j} c_j φ_{J_{jk}}.
6. Not necessarily.
If f(c) > 0 for some c ∈ (a, b), there exists δ > 0 such that f(x) > (1/2)f(c) for |x − c| ≤ δ. Then
∫_a^b f ≥ ∫_{c−δ}^{c+δ} f ≥ (2δ)(1/2)f(c) > 0. If c is an endpoint, a similar argument applies.
10. Use Bolzano's Theorem 5.3.7.
12. Indeed, |g(x)| ≤ 1 and is continuous on every interval [c, 1] where 0 < c < 1. The preceding
exercise applies.
13. Let f(x) := 1/x for x ∈ (0, 1] and f(0) := 0.
16. Let m := inf f(x) and M := sup f. By Theorem 7.1.5(c), we have
m(b − a) ≤ ∫_a^b f ≤ M(b − a). By Bolzano's Theorem 5.3.7, there exists c ∈ [a, b] such that
f(c) = (∫_a^b f)/(b − a).
19. Let (Ṗ_n) be a sequence of tagged partitions of [0, a] with ||Ṗ_n|| → 0 and let Ṗ*_n be the
corresponding "symmetric" partition of [−a, a]. Show that S(f; Ṗ*_n) = 2S(f; Ṗ_n) →
2 ∫_0^a f.
20. Note that x ↦ f(x²) is an even continuous function.
Section 7.3 
1. Suppose that E := {a = c_0 < c_1 < ··· < c_m = b} contains the points in [a, b] where the
derivative F'(x) either does not exist, or does not equal f(x). Then f ∈ R[c_{i−1}, c_i] and
∫_{c_{i−1}}^{c_i} f = F(c_i) − F(c_{i−1}). Exercise 7.2.14 and Corollary 7.2.11 imply that f ∈ R[a, b] and
that ∫_a^b f = Σ_{i=1}^m (F(c_i) − F(c_{i−1})) = F(b) − F(a).
E = ∅. 3. Let E := {−1, 1}. If x ∉ E, G'(x) = g(x).
4. Indeed, B'(x) = |x| for all x. 6. F_c = F_a − ∫_a^c f.
Let h be Thomae's function. There is no function H : [0, 1] → R such that H'(x) = h(x) for x in
some nondegenerate open interval; otherwise Darboux's Theorem 6.2.12 would be contradicted
on this interval.
9. (a) G(x) = F(x) − F(c), (b) H(x) = F(b) − F(x), (c) S(x) = F(sin x) − F(x).
10. Use Theorem 7.3.6 and the Chain Rule 6.1.6.
11. (a) F'(x) = 2x(1 + x⁶)^−1 (b) F'(x) = (1 + x²)^{1/2} − 2x(1 + x⁴)^{1/2}.
15. g'(x) = f(x + c) − f(x − c).
18. (a) Take φ(t) = 1 + t² to get (1/3)(2^{3/2} − 1).
(b) Take φ(t) = 1 + t³ to get 4/3.
(c) Take φ(t) = 1 + √t to get (4/3)(3^{3/2} − 2^{3/2}).
(d) Take φ(t) = t^{1/2} to get 2(sin 2 − sin 1).
19. In (a)–(c) φ'(0) does not exist. For (a), integrate over [c, 4] and let c → 0+. For (c), the
integrand is even so the integral equals 2 ∫_0^1 (1 + t)^{1/2} dt.




20. (b) ∪_n Z_n is contained in ∪_{n,k} J_k^n and the sum of the lengths of these intervals is ≤ Σ_n ε/2^n = ε.
21. (a) The Product Theorem 7.3.16 applies.

(b) We have ∓2t ∫_a^b fg ≤ t² ∫_a^b f² + ∫_a^b g².

(c) Let t → ∞ in (b).

(d) If ∫_a^b f² ≠ 0, let t := (∫_a^b g² / ∫_a^b f²)^{1/2} in (b).

22. Note that sgn ∘ h is Dirichlet's function, which is not Riemann integrable.


Section 7.4 
2. Show that if P is any partition, then L(f; P) = U(f; P) = c(b — a). 
4. If k ≥ 0, then inf{kf(x) : x ∈ I_j} = k inf{f(x) : x ∈ I_j}.
6. Consider the partition P_ε := (0, 1 − ε/2, 1 + ε/2, 2).
9. See Exercise 2.4.8.

11. If |f(x)| ≤ M for x ∈ [a, b] and ε > 0, let P be a partition such that the total length of the
subintervals that contain any of the given points is less than ε/M. Then U(f; P) − L(f; P) ≤ ε
so that Theorem 7.4.8 applies. Also 0 ≤ U(f; P) ≤ ε, so that U(f) = 0.


Section 7.5 


1. Use (4) with n = 4, a = 1, b = 2, h = 1/4. Here 1/4 ≤ f''(c) ≤ 2, so T_4 ≈ 0.697 02.
3. T_4 ≈ 0.782 79.

4. The index n must satisfy 2/12n² < 10^−6; hence n > 1000/√6 ≈ 408.25.

5. S_4 ≈ 0.785 39.

6. The index n must satisfy 96/180n⁴ < 10^−6; hence n ≥ 28.
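
The numerical values in hints 1 to 6 can be checked with short implementations of the Trapezoidal and Simpson rules. The integrands used below (1/x on [1, 2] and 1/(1 + x²) on [0, 1]) are assumptions, chosen because they reproduce the quoted values; the function names are illustrative.

```python
import math

def trapezoid(f, a, b, n):
    """Trapezoidal approximation T_n with n equal subintervals."""
    h = (b - a) / n
    interior = sum(f(a + k * h) for k in range(1, n))
    return h * (0.5 * f(a) + interior + 0.5 * f(b))

def simpson(f, a, b, n):
    """Simpson approximation S_n; n must be even."""
    if n % 2 != 0:
        raise ValueError("n must be even")
    h = (b - a) / n
    odd = sum(f(a + k * h) for k in range(1, n, 2))
    even = sum(f(a + k * h) for k in range(2, n, 2))
    return h / 3 * (f(a) + 4 * odd + 2 * even + f(b))

t4_recip = trapezoid(lambda x: 1 / x, 1.0, 2.0, 4)           # assumed integrand
t4_arctan = trapezoid(lambda x: 1 / (1 + x * x), 0.0, 1.0, 4)  # assumed integrand
s4_arctan = simpson(lambda x: 1 / (1 + x * x), 0.0, 1.0, 4)
n_trap = 1000 / math.sqrt(6)   # threshold from 2/(12 n^2) < 10^-6
print(t4_recip, t4_arctan, s4_arctan, n_trap)
```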


12. The integral is equal to the area of one quarter of the unit circle. The derivatives of h are
unbounded on [0, 1]. Since h''(x) < 0, the inequality is T_n(h) < π/4 < M_n(h). See Exercise 8.

13. Interpret K as an area. Show that h''(x) = −(1 − x²)^{−3/2} and that
h^(4)(x) = −3(1 + 4x²)(1 − x²)^{−7/2}. To eight decimal places, π = 3.141 592 65.


14. Approximately 3.653 484 49. 15. Approximately 4.821 159 32. 
16. Approximately 0.835 648 85. 17. Approximately 1.851 937 05. 
18. 1. 19. Approximately 1.198 140 23. 


20. Approximately 0.904 524 24. 


Section 8.1 
1. Note that 0 ≤ f_n(x) ≤ x/n → 0 as n → ∞.
3. If x > 0, then |f_n(x) − 1| < 1/(nx).
5. If x > 0 then |f_n(x)| ≤ 1/(nx) → 0.
7. If x > 0, then 0 < e^{−x} < 1.
9. If x > 0, then 0 ≤ x²e^{−nx} = x²(e^{−x})^n → 0, since 0 < e^{−x} < 1.
10. If x ∈ Z, the limit equals 1. If x ∉ Z, the limit equals 0.
11. If x ∈ [0, a], then |f_n(x)| ≤ a/n. However, f_n(n) = 1/2.
14. If x ∈ [0, b], then |f_n(x)| ≤ b^n. However, f_n(2^{−1/n}) = 1/3.
15. If x ∈ [a, ∞), then |f_n(x)| ≤ 1/(na). However, f_n(1/n) = (1/2) sin 1 > 0.

18. The maximum of f_n on [0, ∞) is at x = 1/n, so ||f_n||_{[0,∞)} = 1/(ne).
20. If n is sufficiently large, ||f_n||_{[a,∞)} = n²a²/e^{na}. However, ||f_n||_{[0,∞)} = 4/e².
23. Let M be a bound for (f_n(x)) and (g_n(x)) on A, whence also |f(x)| ≤ M. The Triangle
Inequality gives |f_n(x)g_n(x) − f(x)g(x)| ≤ M(|f_n(x) − f(x)| + |g_n(x) − g(x)|) for x ∈ A.
Section 8.2 
1. The limit function is f(x) := 0 for 0 ≤ x < 1, f(1) := 1/2, and f(x) := 1 for 1 < x ≤ 2.

4. If ε > 0 is given, let K be such that if n ≥ K, then ||f_n − f||_I < ε/2. Then
|f_n(x_n) − f(x_0)| ≤ |f_n(x_n) − f(x_n)| + |f(x_n) − f(x_0)| ≤ ε/2 + |f(x_n) − f(x_0)|. Since f is
continuous (by Theorem 8.2.2) and x_n → x_0, then |f(x_n) − f(x_0)| < ε/2 for n ≥ K', so that
|f_n(x_n) − f(x_0)| < ε for n ≥ max{K, K'}.

Here f(0) = 1 and f(x) = 0 for x ∈ (0, 1]. The convergence is not uniform on [0, 1].
Given ε := 1, there exists K > 0 such that if n ≥ K and x ∈ A, then |f_n(x) − f(x)| < 1, so that
|f_n(x)| ≤ |f_K(x)| + 1 for all x ∈ A. Let M := max{||f_1||_A, ..., ||f_{K−1}||_A, ||f_K||_A + 1}.

8. f_n(1/√n) = √n/2.

10. Here (g_n) converges uniformly to the zero function. The sequence (g'_n) does not converge
uniformly.

11. Use the Fundamental Theorem 7.3.1 and Theorem 8.2.4.

13. If a > 0, then ||f_n||_{[a,π]} ≤ 1/(na) and Theorem 8.2.4 applies.

15. Here ||g_n||_{[0,1]} ≤ 1 for all n. Now apply Theorem 8.2.5.

20. Let f_n(x) := x^n on [0, 1).

Section 8.3 

1. Let A := x > 0 and let m → ∞ in (5). For the upper estimate on e, take x = 1 and n = 3 to
obtain |e − 2 2/3| < 1/12, so e < 2 3/4.

2. Note that if n ≥ 9, then 2/(n + 1)! < 6 × 10^−7 < 5 × 10^−6. Hence e ≈ 2.71828.
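
The estimate in hint 2 can be verified by summing the series for e through the term 1/9!. This is a small sketch; `exp_partial_sum` is an illustrative name, and the bound 2/(n + 1)! is the one quoted in the hint.

```python
import math

def exp_partial_sum(n):
    """Partial sum 1 + 1/1! + 1/2! + ... + 1/n!."""
    return sum(1 / math.factorial(k) for k in range(n + 1))

e9 = exp_partial_sum(9)
bound = 2 / math.factorial(10)   # 2/(n + 1)! with n = 9
print(e9, bound, math.e - e9)
```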

3. Evidently E_n(x) ≤ e^x for x ≥ 0. To obtain the other inequality, apply Taylor's Theorem 6.4.1 to
[0, a].
5. Note that 0 ≤ t^n/(1 + t) ≤ t^n for t ∈ [0, x].
6. ln 1.1 ≈ 0.0953 and ln 1.4 ≈ 0.3365. Take n > 19,999.

7. ln 2 ≈ 0.6931.

10. L'(1) = lim[L(1 + 1/n) − L(1)]/(1/n) = lim L((1 + 1/n)^n) = L(lim(1 + 1/n)^n) = L(e) = 1.

11. (c) (xy)^α = E(αL(xy)) = E(αL(x) + αL(y)) = E(αL(x)) · E(αL(y)) = x^α y^α.

12. (b) (x^α)^β = E(βL(x^α)) = E(βαL(x)) = x^{αβ}, and similarly for (x^β)^α.

15. Use 8.3.14 and 8.3.9(vii).

17. Indeed, we have log_a x = (ln x)/(ln a) = [(ln x)/(ln b)] · [(ln b)/(ln a)] if a ≠ 1, b ≠ 1. Now
take a = 10, b = e.

Section 8.4 

1. If n > 2|x|, then |cos x − C_n(x)| ≤ (16/15)|x|^{2n}/(2n)!, so cos(0.2) ≈ 0.980 067, cos 1 ≈
0.540 302. Similarly, sin(0.2) ≈ 0.198 669 and sin 1 ≈ 0.841 471.
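
The four decimal values in hint 1 can be checked against partial sums of the cosine and sine series. The sketch assumes that C_n and S_n denote the sums of the first n nonzero terms (so C_n ends with the x^{2n−2} term); that indexing is an assumption, chosen because it is consistent with the stated remainder bound.

```python
import math

def cos_partial(x, n):
    """First n terms of the cosine series: sum of (-1)^k x^(2k)/(2k)!."""
    return sum((-1) ** k * x ** (2 * k) / math.factorial(2 * k) for k in range(n))

def sin_partial(x, n):
    """First n terms of the sine series: sum of (-1)^k x^(2k+1)/(2k+1)!."""
    return sum((-1) ** k * x ** (2 * k + 1) / math.factorial(2 * k + 1) for k in range(n))

def cos_bound(x, n):
    """Remainder bound (16/15)|x|^(2n)/(2n)!, stated for n > 2|x|."""
    return (16 / 15) * abs(x) ** (2 * n) / math.factorial(2 * n)

values = (cos_partial(0.2, 6), cos_partial(1.0, 8),
          sin_partial(0.2, 6), sin_partial(1.0, 8))
print(values)
```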

4. We integrate 8.4.8(x) twice on [0, x]. Note that the polynomial on the left has a zero in the
interval [1.56, 1.57], so 1.56 ≤ π/2.
5. Exercise 8.4.4 shows that C_4(x) ≤ cos x ≤ C_3(x) for all x ∈ R. Integrating several times, we
get S_4(x) ≤ sin x ≤ S_5(x) for all x > 0. Show that S_4(3.05) > 0 and S_5(3.15) < 0. (This
procedure can be sharpened.)

6. If |x| ≤ A and m > n > 2A, then |c_m(x) − c_n(x)| < (16/15)A^{2n}/(2n)!, whence the
convergence of (c_n) to c is uniform on each interval [−A, A].

7. D[(c(x))² − (s(x))²] = 0 for all x ∈ R. For uniqueness, argue as in 8.4.4.

8. Let g(x) := f(0)c(x) + f'(0)s(x) for x ∈ R, so that g''(x) = g(x), g(0) = f(0) and g'(0) = f'(0).
Therefore h(x) := f(x) − g(x) has the property that h''(x) = h(x) for all x ∈ R and
h(0) = 0, h'(0) = 0. Thus g(x) = f(x) for all x ∈ R, so that f(x) = f(0)c(x) + f'(0)s(x).

9. If g(x) := c(−x), show that g''(x) = g(x) and g(0) = 1, g'(0) = 0, so that g(x) = c(x) for all
x ∈ R. Therefore c is even.

Section 9.1 
1. Let s_n be the nth partial sum of Σ_1^∞ a_n, let t_n be the nth partial sum of Σ_1^∞ |a_n|, and suppose that
a_n ≥ 0 for n > P. If m > n > P, show that t_m − t_n = s_m − s_n. Now apply the Cauchy Criterion.

3. Take positive terms until the partial sum exceeds 1, then take negative terms until the partial sum
is less than 1, then take positive terms until the partial sum exceeds 2, etc.

5. Yes.

6. If n ≥ 2, then s_n = −ln 2 − ln n + ln(n + 1). Yes.

9. We have s_{2n} − s_n ≥ n a_{2n} = (1/2)(2n a_{2n}), and s_{2n+1} − s_n ≥ (1/2)(2n + 1)a_{2n+1}. Consequently
lim(n a_n) = 0.
11. Indeed, if |n² a_n| ≤ M for all n, then |a_n| ≤ M/n².
13. (a) Rationalize to obtain Σ x_n, where x_n := [√n(√(n + 1) + √n)]^−1 and note that
x_n ≈ y_n := 1/(2n). Now apply the Limit Comparison Test 3.7.8.
(b) Rationalize and compare with Σ 1/n^{3/2}.
14. If Σ a_n is absolutely convergent, the partial sums of Σ |a_n| are bounded, say by M. Evidently
the absolute value of the partial sums of any subseries of Σ a_n are also bounded by M.
Conversely, if every subseries of Σ a_n is convergent, then the subseries consisting of the
strictly positive (and strictly negative) terms are absolutely convergent, whence it follows that
Σ a_n is absolutely convergent.
Section 9.2 
1. (a) Convergent; compare with Σ 1/n². (c) Divergent; note that 2^{1/n} → 1.
2. (a) Divergent; apply 9.2.1 with b_n := 1/n.
(c) Convergent; use 9.2.4 and note that (n/(n + 1))^n → 1/e < 1.
3. (a) (ln n)^p < n for large n, by L'Hospital's Rule.
(c) Convergent; note that (ln n)^{ln n} > n² for large n.
(e) Divergent; apply 9.2.6 or Exercise 3.7.15.
4. (a) Convergent (b) Divergent (c) Divergent
(d) Convergent; note that (ln n)exp(−n^{1/2}) < n exp(−n^{1/2}) < 1/n² for large n, by L'Hospital's
Rule.
(e) Divergent (f) Divergent.
Apply the Integral Test 9.2.6.
7. (a, b) Convergent (c) Divergent (d) Convergent.
9. If m > n ≥ K, then |s_m − s_n| ≤ |x_{n+1}| + ··· + |x_m| < r^{n+1}/(1 − r). Now let m → ∞.


12. (a) A crude estimate of the remainder is given by s − s_4 < ∫_5^∞ x^−2 dx = 1/5. Similarly
s − s_10 < 1/11 and s − s_n < 1/(n + 1), so that 999 terms suffice to get s − s_999 < 1/1000.

(d) If n ≥ 4, then x_{n+1}/x_n ≤ 5/8 so (by Exercise 10) |s − s_4| ≤ 5/12. If n ≥ 10, then
x_{n+1}/x_n ≤ 11/20 so that |s − s_10| ≤ (10/2^10)(11/9) < 0.012. If n = 14, then
|s − s_14| < 0.000 99.

13. (b) Here Σ_{n+1}^∞ < ∫_n^∞ x^{−3/2} dx = 2/√n, so |s − s_10| < 0.633 and |s − s_n| < 0.001
when n > 4 × 10⁶.
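
The integral bound 2/√n in hint 13(b) can be compared with an actual tail of the series. The sketch assumes the terms are k^{−3/2}, which is what the integrand x^{−3/2} suggests; the series itself is not restated in the hint, so this is an assumption, and the function names are illustrative.

```python
import math

def tail_bound(n):
    """Value of the integral of x^(-3/2) from n to infinity, namely 2/sqrt(n)."""
    return 2 / math.sqrt(n)

def truncated_tail(n, upper):
    """Sum of k^(-3/2) for k = n + 1, ..., upper (a finite piece of the tail)."""
    return sum(k ** -1.5 for k in range(n + 1, upper + 1))

piece = truncated_tail(10, 100_000)
print(piece, tail_bound(10), tail_bound(4_000_001))
```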
(c) If n ≥ 4, then |s − s_n| ≤ (0.694)x_n so that |s − s_4| < 0.065. If n ≥ 10, then |s − s_n| ≤
(0.628)x_n so that |s − s_10| < 0.000 023.

14. Note that (s_{3n}) is not bounded.

16. Note that, for an integer with n digits, there are 9 ways of picking the first digit and 10 ways of 
picking each of the other n — 1 digits. There is one value of m, from 1 to 9, there is one value 
from 10 to 19, one from 20 to 29, etc. 

18. Here lim(n(1 − x_{n+1}/x_n)) = (c − a − b) + 1, so the series is convergent if c > a + b and is
divergent if c < a + b.

Section 9.3 

1. (a) Absolutely convergent (b) Conditionally convergent 
(c) Divergent (d) Conditionally convergent. 

2. Show by induction that s_2 < s_4 < s_6 < ··· < s_5 < s_3 < s_1. Hence the limit lies between s_n and
s_{n+1} so that |s − s_n| < |s_{n+1} − s_n| = z_{n+1}.

5. Use Dirichlet's Test with (y_n) := (+1, −1, −1, +1, +1, −1, −1, ...). Or group the terms in pairs
(after the first) and use the Alternating Series Test.

7. If f(x) := (ln x)^p/x^q, then f'(x) < 0 for x sufficiently large. L'Hospital's Rule shows that the
terms in the alternating series approach 0.

8. (a) Convergent (b) Divergent (c) Divergent (d) Divergent.

11. Dirichlet's Test does not apply (directly, at least), since the partial sums of the series generated
by (1, −1, −1, 1, 1, 1, ...) are not bounded.

15. (a) Use Abel's Test with x_n := 1/n.

(b) Use the Cauchy Inequality with x_n := √a_n, y_n := 1/n, to get
Σ √a_n/n ≤ (Σ a_n)^{1/2}(Σ 1/n²)^{1/2}, establishing convergence.

(d) Let a_n := [n(ln n)²]^−1, which converges by the Integral Test. However, b_n := [√n ln n]^−1,
which diverges.

Section 9.4 

1. (a) Take M, := 1/n? in the Weierstrass M-Test. 
(c) Since |sin y| < |y|, the series converges for all x. But it is not uniformly convergent on R. If 
a > 0, the series is uniformly convergent for |x| < a. 
(d) If 0 ≤ x ≤ 1, the series is divergent. If 1 < x < ∞, the series is convergent. It is uniformly
convergent on [a, ∞) for a > 1. However, it is not uniformly convergent on (1, ∞).
4. If ρ = ∞, then the sequence (|a_n|^{1/n}) is not bounded. Hence if |x_0| > 0, then there are infinitely
many k ∈ N with |a_k|^{1/k} > 1/|x_0| so that |a_k x_0^k| > 1. Thus the series is not convergent when
x_0 ≠ 0.
5. Suppose that L := lim(|a_n|/|a_{n+1}|) exists and that 0 ≤ L ≤ ∞. It follows from the Ratio Test
that Σ a_n x^n converges for |x| < L and diverges for |x| > L. The Cauchy-Hadamard Theorem
implies that L = R.
6. (a) R = ∞ (b) R = ∞
(c) R = 1/e (d) R = 1
(e) R = 4 (f) R = 1.

8. Use lim(n^{1/n}) = 1.
10. By the Uniqueness Theorem 9.4.13, a_n = (−1)^n a_n for all n.
12. If n ∈ N, there exists a polynomial P_n such that f^(n)(x) = e^{−1/x²} P_n(1/x) for x ≠ 0.
13. Let g(x) := 0 for x ≥ 0 and g(x) := e^{−1/x²} for x < 0. Show that g^(n)(0) = 0 for all n.

16. Substitute −y for x in Exercise 15 and integrate from y = 0 to y = x for |x| < 1, which is
justified by Theorem 9.4.11.

19. ∫_0^x e^{−t²} dt = Σ_{n=0}^∞ (−1)^n x^{2n+1}/n!(2n + 1) for x ∈ R.

20. Apply Exercise 14 and ∫_0^{π/2} (sin x)^{2n} dx = (π/2) · [1·3·5···(2n − 1)]/[2·4·6···2n].

Section 10.1 


1. (a) Since t_i − δ(t_i) ≤ x_{i−1} and x_i ≤ t_i + δ(t_i), then 0 ≤ x_i − x_{i−1} ≤ 2δ(t_i).
(b) Apply (a) to each subinterval.

2. (b) Consider the tagged partition {([0, 1], 1), ([1, 2], 1), ([2, 3], 3), ([3, 4], 3)}.

3. (a) If Ṗ = {([x_{i−1}, x_i], t_i)}_{i=1}^n and if t_k is a tag for both subintervals [x_{k−1}, x_k] and [x_k, x_{k+1}],
we must have t_k = x_k. We replace these two subintervals by the subinterval [x_{k−1}, x_{k+1}]
with the tag t_k, keeping the δ-fineness property.

(b) No.
(c) If t_k ∈ (x_{k−1}, x_k), then we replace [x_{k−1}, x_k] by the two intervals [x_{k−1}, t_k] and [t_k, x_k]
both tagged by t_k, keeping the δ-fineness property.

4. If x_{k−1} ≤ 1 ≤ x_k and if t_k is the tag for [x_{k−1}, x_k], then we cannot have t_k > 1, since then
t_k − δ(t_k) = (1/2)(t_k + 1) > 1. Similarly, we cannot have t_k < 1, since then t_k + δ(t_k) =
(1/2)(t_k + 1) < 1. Therefore t_k = 1.

5. (a) Let δ(t) := (1/2)min{|t − 1|, |t − 2|, |t − 3|} if t ≠ 1, 2, 3 and δ(t) := 1 for t = 1, 2, 3.
(b) Let δ_2(t) := min{δ(t), δ_1(t)}, where δ is as in part (a).

7. (a) F_1(x) := (2/3)x^{3/2} + 2x^{1/2},
(b) F_2(x) := (2/3)(1 − x)^{3/2} − 2(1 − x)^{1/2},
(c) F_3(x) := (2/3)x^{3/2}(ln x − 2/3) for x ∈ (0, 1] and F_3(0) := 0,
(d) F_4(x) := 2x^{1/2}(ln x − 2) for x ∈ (0, 1] and F_4(0) := 0,
(e) F_5(x) := −√(1 − x²) + Arcsin x,
(f) F_6(x) := Arcsin(x − 1).
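
A quick numerical differentiation is a useful sanity check on antiderivatives like those in hint 7. Differentiating the stated F_3 and F_4 by hand gives √x ln x and (ln x)/√x respectively; treating those as the integrands of the exercise is an inference. The helper `central_diff` is an illustrative name.

```python
import math

def F3(x):
    return (2 / 3) * x ** 1.5 * (math.log(x) - 2 / 3)

def F4(x):
    return 2 * math.sqrt(x) * (math.log(x) - 2)

def central_diff(F, x, h=1e-6):
    """Symmetric difference quotient approximating F'(x)."""
    return (F(x + h) - F(x - h)) / (2 * h)

x = 0.5
err3 = abs(central_diff(F3, x) - math.sqrt(x) * math.log(x))
err4 = abs(central_diff(F4, x) - math.log(x) / math.sqrt(x))
print(err3, err4)
```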


8. The tagged partition Ṗ_z need not be δ_ε-fine, since the value δ_ε(z) may be much smaller than
δ_ε(x_j).
9. If f were integrable, then ∫_0^1 f ≥ ∫_0^1 s_n = 1/2 + 1/3 + ··· + 1/(n + 1).

10. We enumerate the nonzero rational numbers as r_k = m_k/n_k and define δ_ε(m_k/n_k) :=
ε/(n_k 2^{k+1}) and δ_ε(x) := 1 otherwise.

12. The function M is not continuous on [—2, 2]. 

13. L_1 is continuous and L_1'(x) = l_1(x) for x ≠ 0, so Theorem 10.1.9 applies.

15. We have C_1'(x) = (3/2)x^{1/2}cos(1/x) + x^{−1/2}sin(1/x) for x > 0. Since the first term in C_1' has
a continuous extension to [0, 1], it is integrable.

16. We have C_2'(x) = cos(1/x) + (1/x) sin(1/x) for x > 0. By the analogue of Exercise 7.2.12,
the first term belongs to R[0, 1].

17. (a) Take φ(t) := t² + t − 2 so E_φ = ∅ to get 6.
(b) Take φ(t) := √t so E_φ = {0} to get 2(2 + ln 3).
(c) Take φ(t) := √(t − 1) so E_φ = {1} to get 2 Arctan 2.
(d) Take φ(t) := Arcsin t so E_φ = {1} to get (1/4)π.


19. (a) In fact f(x) := F'(x) = cos(π/x) + (π/x) sin(π/x) for x > 0. We set f(0) := 0,
F'(0) := 0. Note that f is continuous on (0, 1].
(b) F(a_k) = 0 and F(b_k) = (−1)^k/k. Apply Theorem 10.1.9.
(c) If |f| ∈ R*[0, 1], then Σ_{k=1}^n 1/k ≤ Σ_{k=1}^n ∫_{a_k}^{b_k} |f| ≤ ∫_0^1 |f| for all n ∈ N.

20. Indeed, sgn(f(x)) = (−1)^k = m(x) on [a_k, b_k] so m(x) · f(x) = |m(x)f(x)| for x ∈ [0, 1].
Since the restrictions of m and |m| to every interval [c, 1] for 0 < c < 1 are step functions,
they belong to R[c, 1]. By Exercise 7.2.11, m and |m| belong to R[0, 1] and

∫_0^1 m = Σ_{k=1}^∞ (−1)^k/k(2k + 1) and ∫_0^1 |m| = Σ_{k=1}^∞ 1/k(2k + 1).

21. Indeed, φ(x) = Φ'(x) = |cos(π/x)| + (π/x) sin(π/x) · sgn(cos(π/x)) for x ∉ E by Example
6.1.7(c). Evidently φ is not bounded near 0. If x ∈ [a_k, b_k], then φ(x) = |cos(π/x)| +
(π/x)|sin(π/x)| so that ∫_{a_k}^{b_k} |φ| = Φ(b_k) − Φ(a_k) = 1/k, whence |φ| ∉ R*[0, 1].

22. Here ψ(x) = Ψ'(x) = 2x|cos(π/x)| + π sin(π/x) · sgn(cos(π/x)) for x ∉ {0} ∪ E_1 by
Example 6.1.7(a). Since ψ is bounded, Exercise 7.2.11 applies. We cannot apply Theorem 7.3.1 to
evaluate the integral of ψ since E is not finite, but Theorem 10.1.9 applies and ψ ∈ R[0, 1]. Corollary 7.3.15
implies that |ψ| ∈ R[0, 1].

23. If p ≥ 0, then mp ≤ fp ≤ Mp, where m and M denote the infimum and the supremum of f on
[a, b], so that m ∫_a^b p ≤ ∫_a^b fp ≤ M ∫_a^b p. If ∫_a^b p = 0, the result is trivial; otherwise, the
conclusion follows from Bolzano's Intermediate Value Theorem 5.3.7.

24. By the Multiplication Theorem 10.1.14, fg ∈ R*[a, b]. If g is increasing, then g(a)f ≤ fg ≤
g(b)f so that g(a) ∫_a^b f ≤ ∫_a^b fg ≤ g(b) ∫_a^b f. Let K(x) := g(a) ∫_a^x f + g(b) ∫_x^b f, so that K is
continuous and takes all values between K(b) and K(a).
Section 10.2 

2. (a) If G(x) := 3x^{1/3} for x ∈ [0, 1] then ∫_c^1 g = G(1) − G(c) → G(1) = 3.
(b) We have ∫_c^1 (1/x)dx = −ln c, which does not have a limit in R as c → 0.

3. Here ∫_0^c (1 − x)^{−1/2} dx = 2 − 2(1 − c)^{1/2} → 2 as c → 1−.

5. Because of continuity, g_1 ∈ R*[c, 1] for all c ∈ (0, 1). If ω(x) := x^{−1/2}, then |g_1(x)| ≤ ω(x) for
all x ∈ [0, 1]. The "left version" of the preceding exercise implies that g_1 ∈ R*[0, 1] and the
above inequality and the Comparison Test 10.2.4 imply that g_1 ∈ L[0, 1].

6. (a) The function is bounded on [0, 1] (use L'Hospital) and continuous in (0, 1).
(c) If x ∈ (0, 1/2], the integrand is dominated by |(ln(1/2)) ln x|. If x ∈ [1/2, 1), the integrand is
dominated by |(ln(1/2)) ln(1 − x)|.

7. (a) Convergent (b, c) Divergent (d, e) Convergent (f) Divergent.

10. By the Multiplication Theorem 10.1.14, fg ∈ R*[a, b]. Since |f(x)g(x)| ≤ B|f(x)|, then fg ∈
L[a, b] and ||fg|| ≤ B||f||.

11. (a) Let f(x) := (−1)^k 2^k/k for x ∈ [c_{k−1}, c_k) and f(1) := 0, where the c_k are as in Example
10.2.2(a). Then f⁺ := max{f, 0} ∉ R*[0, 1].
(b) Use the first formula in the proof of Theorem 10.2.7.
13. (ii) If f(x) = g(x) for all x ∈ [a, b], then dist(f, g) = ∫_a^b |f − g| = 0.
(iii) dist(f, g) = ∫_a^b |f − g| = ∫_a^b |g − f| = dist(g, f).
(iv) dist(f, h) = ∫_a^b |f − h| ≤ ∫_a^b |f − g| + ∫_a^b |g − h| = dist(f, g) + dist(g, h).
16. If (f_n) converges to f in L[a, b], given ε > 0 there exists K(ε/2) such that if m, n ≥ K(ε/2) then
||f_m − f|| < ε/2 and ||f_n − f|| < ε/2. Therefore ||f_m − f_n|| ≤ ||f_m − f|| + ||f − f_n|| <
ε/2 + ε/2 = ε. Thus we may take H(ε) := K(ε/2).
18. If m > n, then ||g_m − g_n|| ≤ 1/n + 1/m → 0. One can take g := sgn.
19. No. 
20. We can take k to be the 0-function. 


Section 10.3 


1. Let b ≥ max{a, 1/δ(∞)}. If Ṗ is a δ-fine partition of [a, b], show that Ṗ is a δ-fine subpartition
of [a, ∞).

3. If f ∈ L[a, ∞), apply the preceding exercise to |f|. Conversely, if ∫_p^q |f| < ε for q > p ≥ K(ε),
then |∫_a^q f − ∫_a^p f| ≤ ∫_p^q |f| < ε so both lim_γ ∫_a^γ f and lim_γ ∫_a^γ |f| exist; therefore f, |f| ∈
R*[a, ∞) and so f ∈ L[a, ∞).

5. If f, g ∈ L[a, ∞), then f, |f|, g, and |g| belong to R*[a, ∞), so Example 10.3.3(a) implies that
f + g and |f| + |g| belong to R*[a, ∞) and that ∫_a^∞ (|f| + |g|) = ∫_a^∞ |f| + ∫_a^∞ |g|. Since
|f + g| ≤ |f| + |g|, it follows that ∫_a^γ |f + g| ≤ ∫_a^γ |f| + ∫_a^γ |g| ≤ ∫_a^∞ |f| + ∫_a^∞ |g|, whence
||f + g|| ≤ ||f|| + ||g||.

6. Indeed, ∫_1^γ (1/x) dx = ln γ, which does not have a limit as γ → ∞. Or, use Exercise 2 and the
fact that ∫_p^{2p} (1/x)dx = ln 2 > 0 for all p ≥ 1.
If γ > 0, then ∫_0^γ cos x dx = sin γ, which does not have a limit as γ → ∞.
9. (a) We have ∫_0^γ e^{−sx} dx = (1/s)(1 − e^{−sγ}) → 1/s.
(b) Let G(x) := −(1/s)e^{−sx} for x ∈ [0, ∞), so G is continuous on [0, ∞) and G(x) → 0 as
x → ∞. By the Fundamental Theorem 10.3.5, we have ∫_0^∞ g = −G(0) = 1/s.
12. (a) If x ≥ e, then (ln x)/x ≥ 1/x.
(b) Integrate by parts on [1, γ] and then let γ → ∞.
13. (a) |sin x| ≥ 1/√2 > 1/2 and 1/x > 1/(n + 1)π for x ∈ (nπ + π/4, nπ + 3π/4).
(b) If γ > (n + 1)π, then ∫_0^γ |D| > (1/4)(1/1 + 1/2 + ··· + 1/(n + 1)).
15. Let u = φ(x) = x². Now apply Exercise 14.

16. (a) Convergent (b, c) Divergent (d) Convergent (e) Divergent
(f) Convergent.
17. (a) If f_1(x) := sin x, then f_1 ∉ R*[0, ∞). In Exercise 14, take f_2(x) := x^−1 sin x and
φ_2(x) := 1/√x.
(c) Take f(x) := x^{−1/2} sin x, and φ(x) := (x + 1)/x.
18. (a) f(x) := sin x is in R*[0, γ], and F(x) := ∫_0^x sin t dt = 1 − cos x is bounded on [0, ∞),
and φ(x) := 1/x decreases monotonely to 0.
(c) F(x) := ∫_0^x cos t dt = sin x is bounded on [0, ∞) and φ(x) := x^{−1/2} decreases
monotonely to 0.
19. Let u = φ(x) := x².
20. (a) If γ > 0, then ∫_0^γ e^{−x} dx = 1 − e^{−γ} → 1, so e^{−x} ∈ R*[0, ∞). Similarly e^{−|x|} =
e^x ∈ R*(−∞, 0].
(c) 0 ≤ e^{−x²} ≤ e^{−x} for |x| ≥ 1, so e^{−x²} ∈ R*[0, ∞). Similarly on (−∞, 0].


Section 10.4 


1. (a) Converges to 0 at x = 0, to 1 on (0, 1]. Not uniform. Bounded by 1. Increasing. Limit = 1.
(c) Converges to 1 on [0, 1), to 1/2 at x = 1. Not uniform. Bounded by 1. Increasing. Limit = 1.

2. (a) Converges to √x on [0, 1]. Uniform. Bounded by 1. Increasing. Limit = 2/3.
(c) Converges to 1/2 at x = 1, to 0 on (1, 2]. Not uniform. Bounded by 1. Decreasing.
Limit = 0.


10. 


12. 


13. 


14. 


15. 


16. 
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(a) Converges to 1 at x = 0, to 0 on (0, 1]. Not uniform. Bounded by 1. Decreasing. 
Limit = 0. 

(c) Converges to 0. Not uniform. Bounded by 1/e. Not monotone. Limit = 0. 

(e) Converges to 0. Not uniform. Bounded by 1/./2e. Not monotone. Limit = 0. 


(a) The Dominated Convergence Theorem applies. 

(b) f(x) — 0 for x € [0, 1), but (f;,(1)) is not bounded. No obvious dominating function. 
Integrate by parts and use (a). The result shows that the Dominated Convergence Theorem 
does not apply. 


Suppose that (f;(c)) converges for some c€ [a, b]. By the Fundamental Theorem, 
fi(x) — fele) = f> fr By the Dominated Convergence Theorem, ffy > f> g, whence 


(f (x)) converges for all x € [a, b]. Note that if f(x) :=(—1)*, then (f,(x)) does not 
converge for any x € [a,b]. 


Indeed, g(x) := sup{ f(x) : k € N} equals 1/k on (k — 1, k], so that fo g=14+54+---+4. 
Hence g ¢ R*[0, 00). 


(a) If a> 0, then |(e~sin x)/x| < e~ fort € Ja := (a, co). If tk € Ja and th > to E€ Ja, 
then the argument in 10.4.6(d) shows that E is continuous at fo. Also, if tk > 1, then 
|(e~**sin x) /x| < e~* and the Dominated Convergence Theorem implies that E(t,) — 0. 
Thus E(t) > 0 as t > œ. 

(b) It follows as in 10.4.6(e) that E'(to) = — fo e~* sinx dx = —1/(t +1). 

(©) By 10.1.9, E(s)— E(t) = fPE(Hdt=— ff (? 4 1) | dt = Arctan t — Arctans for 
s,t > 0. But E(s) — 0 and Arctan s — 2/2 as s —> ov. 

(d) We do not know that E is continuous as t > 0+. 


Fix x € I. As in 10.4.6(e), if t, to € a, b], there exists tx between f, fo such that f(t, x)— 
f(to, x) = (t — to) £ (tx, x). Therefore a(x) < [f(t,x) —f(to,x)|/(t— to) < @(x) when 
t Æ to. Now argue as before and use the Dominated Convergence Theorem 10.4.5. 


(a) If (sọ) is a sequence of step functions converging to f a.e., and (tk) is a sequence of step 
functions converging to g a.e., Theorem 10.4.9(a) and Exercise 2.2.18 imply that 
(max{sķx, tk}) is a sequence of step functions that converges to max{ f, g} a.e. Similarly, 
for min{f, g}. 


(a) Since f E€ M[a,b] is bounded, it belongs to R*[a, b]. The Dominated Convergence 
Theorem implies that f € R*[a, b]. The Measurability Theorem 10.4.11 now implies that 
f € M{a, 6]. 

(b) Since t+ Arctan?t is continuous, Theorem 10.4.9(b) implies that f;, := Arctano g, € 
Mla, b]. Further, | f;,(x)| < 52 for x € [a, b]. 

(c) If g_k → g a.e., it follows from the continuity of Arctan that f_k → f a.e. Parts (a, b) 
imply that f ∈ M[a, b] and Theorem 10.4.9(b) applied to φ = tan implies that 
g = tan ∘ f ∈ M[a, b]. 


(a) Since 1_E is bounded, it is in R*[a, b] if and only if it is in M[a, b]. 

(c) 1_E′ = 1 − 1_E. 

(d) 1_(E∪F)(x) = max{1_E(x), 1_F(x)} and 1_(E∩F)(x) = min{1_E(x), 1_F(x)}. Further, E\F = 
E ∩ F′. 

(e) If (E_k) is an increasing sequence in M[a, b], then (1_(E_k)) is an increasing sequence in 
M[a, b] with 1_E(x) = lim 1_(E_k)(x), and we can apply Theorem 10.4.9(c). Similarly, (1_(F_k)) is 
a decreasing sequence in M[a, b] and 1_F(x) = lim 1_(F_k)(x). 

(f) Let A_n := ∪_{k=1}^n E_k, so that (A_n) is an increasing sequence in M[a, b] with ∪_{n=1}^∞ A_n = E, so 
(e) applies. Similarly, if B_n := ∩_{k=1}^n F_k, then (B_n) is a decreasing sequence in M[a, b] 
with ∩_{n=1}^∞ B_n = F. 

(a) m(∅) = ∫_a^b 0 = 0 and 0 ≤ 1_E ≤ 1 implies 0 ≤ m(E) = ∫_a^b 1_E ≤ b − a. 

(b) Since 1_[c,d] is a step function, then m([c, d]) = d − c. 

(c) Since 1_E′ = 1 − 1_E, we have m(E′) = ∫_a^b (1 − 1_E) = (b − a) − m(E). 

(d) Note that 1_(E∪F) + 1_(E∩F) = 1_E + 1_F. 

(f) If (E_k) is increasing in M[a, b] to E, then (1_(E_k)) is increasing in M[a, b] to 1_E. The 
Monotone Convergence Theorem 10.4.4 applies. 

(g) If (C_k) is pairwise disjoint and E_n := ∪_{k=1}^n C_k for n ∈ N, then m(E_n) = m(C_1) 
+ ··· + m(C_n). Since ∪_{k=1}^∞ C_k = ∪_{n=1}^∞ E_n and (E_n) is increasing, (f) implies that 

m(∪_{k=1}^∞ C_k) = lim_n m(E_n) = lim_n Σ_{k=1}^n m(C_k) = Σ_{n=1}^∞ m(C_n). 

Section 11.1 
1. If |x —u| < inf{x, 1 — x}, then u < x+ (1 — x) = 1 and u > x — x = 0, so that 0 < u < 1. 


3. Since the union of two open sets is open, then G_1 ∪ ··· ∪ G_k ∪ G_(k+1) = (G_1 ∪ ··· ∪ G_k) ∪ 
G_(k+1) is open. 


The complement of N is the union (−∞, 1) ∪ (1, 2) ∪ ··· of open intervals. 
7. Corollary 2.4.9 implies that every neighborhood of x in Q contains a point not in Q. 


10. x is a boundary point of A ⟺ every neighborhood V of x contains points in A and points in 
C(A) ⟺ x is a boundary point of C(A). 


12. The sets F and C(F) have the same boundary points. Therefore F contains all of its boundary 
points ⟺ C(F) does not contain any of its boundary points ⟺ C(F) is open. 


13. x ∈ A° ⟺ x belongs to an open set V ⊆ A ⟺ x is an interior point of A. 


15. Since A⁻ is the intersection of all closed sets containing A, then by 11.1.5(a) it is a closed set 
containing A. Since C(A⁻) is open, then z ∈ C(A⁻) ⟺ z has a neighborhood V_ε(z) in 
C(A⁻) ⟺ z is neither an interior point nor a boundary point of A. 


19. If G ≠ ∅ is open and x ∈ G, then there exists ε > 0 such that V_ε(x) ⊆ G, whence it follows that 
a := x − ε is in A_x. 


21. If a_x < y < x then since a_x := inf A_x, there exists a′ ∈ A_x such that a_x ≤ a′ < y. Therefore 
(y, x] ⊆ (a′, x] ⊆ G and y ∈ G. 


23. If x ∈ F and n ∈ N, the interval I_n in F_n containing x has length 1/3^n. Let y_n be an endpoint of I_n 
with y_n ≠ x. Then y_n ∈ F (why?) and y_n → x. 


24. As in the preceding exercise, take z_n to be the midpoint of I_n. Then z_n ∉ F (why?) and z_n → x. 
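
The two hints above can be explored numerically. The sketch below assumes that F_n is obtained from [0, 1] by removing open middle thirds n times, and it works with exact rational arithmetic. The function names `cantor_interval` and `in_cantor_level` are hypothetical, and the sample point x = 1/4 is chosen only because it lies in the Cantor set without being an endpoint.

```python
from fractions import Fraction


def cantor_interval(x, n):
    """Return (lo, hi), the closed interval of F_n containing x, or None if x is not in F_n.

    Assumes F_0 = [0, 1] and that F_{k+1} keeps the closed left and right
    thirds of every interval of F_k.
    """
    lo, hi = Fraction(0), Fraction(1)
    if not (lo <= x <= hi):
        return None
    for _ in range(n):
        third = (hi - lo) / 3
        if x <= lo + third:        # x lies in the closed left third
            hi = lo + third
        elif x >= hi - third:      # x lies in the closed right third
            lo = hi - third
        else:                      # x lies in the removed open middle third
            return None
    return lo, hi


def in_cantor_level(x, n):
    """True when x belongs to F_n."""
    return cantor_interval(x, n) is not None


if __name__ == "__main__":
    x = Fraction(1, 4)
    for n in range(1, 6):
        lo, hi = cantor_interval(x, n)
        y_n = hi if hi != x else lo    # an endpoint of I_n different from x
        z_n = (lo + hi) / 2            # the midpoint of I_n
        print(n, y_n, in_cantor_level(y_n, n + 10), z_n, in_cantor_level(z_n, n + 1))
```

The loop only tests membership of y_n in finitely many levels F_m, so it illustrates the claims rather than proving them. The "why?" in each hint still needs the observation that endpoints are never removed, while the midpoint of I_n is removed at stage n + 1.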


Section 11.2 
1. Let G_n := (1 + 1/n, 3) for n ∈ N. 
3. Let G_n := (1/2n, 2) for n ∈ N. 


5. If G_1 is an open cover of K_1 and G_2 is an open cover of K_2, then G_1 ∪ G_2 is an open cover of 
K_1 ∪ K_2. 


7. Let K_n := [0, n] for n ∈ N. 


10. Since K ≠ ∅ is bounded, it follows that inf K exists in R. If K_n := {k ∈ K : k ≤ (inf K) + 1/n}, 
then K_n is closed and bounded, hence compact. By the preceding exercise ∩K_n ≠ ∅, but if 
x_0 ∈ ∩K_n, then x_0 ∈ K and it is readily seen that x_0 = inf K. [Alternatively, use Theorem 11.2.6.] 


12. Let ∅ ≠ K ⊆ R be compact and let c ∈ R. If n ∈ N, there exists x_n ∈ K such that 
sup{|c − x| : x ∈ K} − 1/n < |c − x_n|. Now apply the Bolzano-Weierstrass Theorem. 


15. Let F_1 := {n : n ∈ N} and F_2 := {n + 1/n : n ∈ N, n ≥ 2}. 
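
The exercise statement is not reproduced with this hint. Assuming it asks for two disjoint closed sets whose points come arbitrarily close together, the following sketch checks both properties for the first few members of F_1 and F_2. The helper names are hypothetical.

```python
from fractions import Fraction


def f1_point(n):
    """The n-th point of F_1 = {n : n in N}."""
    return Fraction(n)


def f2_point(n):
    """The n-th point of F_2 = {n + 1/n : n in N, n >= 2}."""
    return Fraction(n) + Fraction(1, n)


def is_natural(q):
    """True when the rational q is a natural number, i.e. a point of F_1."""
    return q.denominator == 1 and q >= 1


if __name__ == "__main__":
    for n in (2, 10, 100, 1000):
        gap = f2_point(n) - f1_point(n)   # distance between matching points, equal to 1/n
        print(n, gap, is_natural(f2_point(n)))
```

Since the gap 1/n tends to 0, the infimum of |x − y| over x ∈ F_1, y ∈ F_2 is 0, although no point of F_2 is a natural number.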


Section 11.3 


1. (a) If a < b ≤ 0, then f⁻¹(I) = ∅. If a < 0 < b, then f⁻¹(I) = (−√b, √b). If 0 ≤ a < b, then 
f⁻¹(I) = (−√b, −√a) ∪ (√a, √b). 


f⁻¹(G) = f⁻¹([0, ε)) = [1, 1 + ε²) = (0, 1 + ε²) ∩ I. 
Let G := (1/2, 3/2). Let F := [-1/2, 1/2]. 
Let f be the Dirichlet Discontinuous Function. 


First note that if A ⊆ R and x ∈ R, then we have x ∈ f⁻¹(R\A) ⟺ f(x) ∈ R\A ⟺ f(x) ∉ 
A ⟺ x ∉ f⁻¹(A) ⟺ x ∈ R\f⁻¹(A); therefore, f⁻¹(R\A) = R\f⁻¹(A). Now use the fact that a 
set F ⊆ R is closed if and only if R\F is open, together with Corollary 11.3.3. 


Section 11.4 


1. If P_i = (x_i, y_i) for i = 1, 2, 3, then d_1(P_1, P_2) ≤ (|x_1 − x_3| + |x_3 − x_2|) + (|y_1 − y_3| + 
|y_3 − y_2|) = d_1(P_1, P_3) + d_1(P_3, P_2). Thus d_1 satisfies the Triangle Inequality. 


Since |f(x) − g(x)| ≤ |f(x) − h(x)| + |h(x) − g(x)| ≤ d_∞(f, h) + d_∞(h, g) for all x ∈ [0, 1], 
it follows that d_∞(f, g) ≤ d_∞(f, h) + d_∞(h, g) and d_∞ satisfies the Triangle Inequality. 


We have s ≠ t if and only if d(s, t) = 1. If s ≠ t, the value of d(s, u) + d(u, t) is either 1 or 2 
depending on whether u equals s or t, or neither. 


Since d_∞(P_n, P) = sup{|x_n − x|, |y_n − y|}, if d_∞(P_n, P) → 0 then it follows that both 
|x_n − x| → 0 and |y_n − y| → 0, whence x_n → x and y_n → y. Conversely, if x_n → x and 
y_n → y, then |x_n − x| → 0 and |y_n − y| → 0, whence d_∞(P_n, P) → 0. 


If a sequence (x_n) in S converges to x relative to the discrete metric d, then d(x_n, x) → 0, which 
implies that x_n = x for all sufficiently large n. The converse is trivial. 


Show that a set consisting of a single point is open. Then it follows that every set is an open set, 
so that every set is also a closed set. (Why?) 


10. Let G ⊆ S_2 be open in (S_2, d_2) and let x ∈ f⁻¹(G) so that f(x) ∈ G. Then there exists an 
ε-neighborhood V_ε(f(x)) ⊆ G. Since f is continuous at x, there exists a δ-neighborhood V_δ(x) 
such that f(V_δ(x)) ⊆ V_ε(f(x)). Since x ∈ f⁻¹(G) is arbitrary, we conclude that f⁻¹(G) is open in 
(S_1, d_1). The proof of the converse is similar. 


11. Let G = {G_α} be a cover of f(S) ⊆ R by open sets in R. It follows from 11.4.11 that each set 
f⁻¹(G_α) is open in (S, d). Therefore, the collection {f⁻¹(G_α)} is an open cover of S. Since 
(S, d) is compact, a finite subcollection {f⁻¹(G_α1), ..., f⁻¹(G_αN)} covers S, whence it follows 
that the sets {G_α1, ..., G_αN} must form a finite subcover of G for f(S). Since G was an arbitrary 
open cover of f(S), we conclude that f(S) is compact. 


Contrapositive, 351 

proof by, 355 
Convergence: 

absolute, 267 

of integrals, 315 ff. 

interval of, 283 

in a metric space, 343 

pointwise, 241 

radius of, 283 

of a sequence, 56 

of a sequence of functions, 281 

of a series, 94 

of a series of functions, 281 

uniform, 281 
Converse, 351 
Convex function, 192 ff. 
Cosine function, 263 
Countability: 

of N x N, 18, 358 

of Q, 19 

of Z, 18 
Countable: 

additivity, 325 

set, 17 ff. 
Counter-example, 353 
Cover, 333 
Curve, space-filling, 368 
Cyclops, 76 


D 
D’Alembert’s Ratio Test, 272 
Darboux, Gaston, 225 
Darboux Intermediate Value Theorem, 
178 
Integral, 228 ff. 
Decimal representation, 51 
periodic, 51 
Decreasing function, 153, 174 
sequence, 71 
DeMorgan’s Laws, 3, 350 
Density Theorem, 44 
Denumerable set (see also countable set), 
17 
Derivative, 162 ff. 
higher order, 188 
second, 188 
Descartes, René, 161 
Difference: 
symmetric, 11 
of two functions, 111 
of two sequences, 61 
Differentiable function, 162 
uniformly, 180 
Differentiation Theorem, 285 
Dini, Ulisse, 252 
Dini’s Theorem, 252 
Direct image, 6 
proof, 354 
Dirichlet discontinuous function, 127, 207, 
209, 221, 291, 321 
integral, 311, 320 
test, 279 
Discontinuity Criterion, 126 
Discrete metric, 343 
Disjoint sets, 3 
Disjunction, 349 
Distance, 34, 306 
Divergence: 
of a function, 105, 108 
of a sequence, 56, 80, 91 ff. 
Division, in R, 25 
Domain of a function, 5 
Dominated Convergence Theorem, 318 
Double implication, 351 
negation, 349 


E 
Element, of a set, 1 
Elliptic integral, 287 


Empty set ∅, 3 

Endpoints of intervals, 46 

Equi-integrability, 316 

Equivalence, logical, 349 

Euler, Leonhard, 76 

Euler’s constant, 276 
number e, 75, 255 

Even function, 171, 216 
number, 2 

Excluded middle, 349 

Existential quantifier ∃, 352 

Exponential function, 253 ff. 

Exponents, 25 

Extension of a function, 144 ff. 

Extremum, absolute, 135 
relative, 172, 175, 191 


F 
F (= Cantor set), 331 
Falsity, 349 
Fermat, Pierre de, 161, 198 
Fibonacci sequence, 56, 89 
Field, 24 
δ-Fine partition, 149, 289 
Finite set, 16 ff. 
First Derivative Test, 175 
Fluxions, 161 
Fresnel Integral, 314 
Function(s), 5 
additive, 116, 134, 156 
Bessel, 173 
bijective, 8 
bounded, 41, 111, 134 
composition of, 9, 133 
continuous, 125 ff., 337 ff. 
convex, 192 ff. 
decreasing, 153, 174 
derivative of, 162 
difference of, 111 
differentiable, 162 
direct image of, 6 


Dirichlet, 127, 207, 209, 221, 279, 291, 321 
discontinuous, 125 
domain of, 5 
even, 171, 216 
exponential, 255 ff. 
gauge, 149 
graph of, 5 
greatest integer, 129 


INDEX 


hyperbolic, 266 

image of, 5 

increasing, 153, 174 

injective, 7 

integrable, 201, 290 

inverse, 7, 156, 168 

inverse cosine, 10 

inverse image of, 7 

inverse sine, 10 

jump of, 155 

limit of, 104 ff. 

Lipschitz, 143 

logarithm, 257 ff. 

measurable, 320 

metric, 342 

monotone, 153 

multiple of, 111 

nondifferentiable, 163, 367 

nth root, 157 

odd, 171, 216 

one-one, 7 

onto, 7 

oscillation, 361 

periodic, 148 

piecewise linear, 147 

polynomial, 113, 131, 148 

power, 159, 258 

product of, 111 

quotient of, 111 

range of, 5 

rational, 131 

rational power, 159 

restriction of, 10 

sequence of, 241 ff. 

series of, 281 ff. 

signum, 109, 127 

square root, 10, 43 

step, 145, 210 

sum of, 111 

surjective, 7 

Thomae’s, 128, 206, 222 

Translate, 207 

trigonometric, 131, 260 ff. 

values of, 5 

Fundamental Theorems of Calculus, 216 ff., 295, 297 


G 
Gallus gallus, 349 
Gauge, 149 ff., 252, 289 ff. 
Generalized Riemann integral, 290 ff. 
Geometric Mean, 29, 260 
series, 95 
Global Continuity Theorem, 338, 346 
Graph, 5 
Greatest integer function, 129, 224 
lower bound (= infimum), 37 


H 
Hadamard-Cauchy Theorem, 283 
Hake’s Theorem, 302, 309 
Half-closed interval, 46 
Half-open interval, 46 
Harmonic series, 88, 96, 267 
Heine-Borel Theorem, 335 
Henstock, Ralph, 289 
Higher order derivatives, 188 
Horizontal Line Tests, 8 
Hyperbolic functions, 266 
Hypergeometric series, 277 
Hypothesis, 350 

induction, 13 


I 
Image, 6 
Implication, 350 
Improper integrals, 272, 302 ff. 
Increasing function, 153, 174 
sequence, 71 
Indefinite integral, 218 
Indeterminate forms, 181 
Indirect proofs, 355 
Induction, Mathematical, 12 ff. 
Inequality: 
Arithmetic-Geometric, 29 
Bernoulli, 30, 177 
Schwarz, 225 
Triangle, 32, 342 
Infimum, 37 
Infinite limits, 119 
series, 94 ff., 267 ff. 
set, 16 ff. 
Injection, 7 
Injective function, 7 
Integers, 2 
Integral: 
Darboux, 228 ff. 
Dirichlet, 311, 320 
elliptic, 287 
Fresnel, 314 


generalized Riemann, 290 ff. 
improper, 272, 302 ff. 
indefinite, 218 
Lebesgue, 289, 304, 362 
lower, 227 
Riemann, 201 ff. 
Test, for series, 273 
upper, 227 
Integration by parts, 222, 299 
Interchange Theorems: 
relating to continuity, 248 
relating to differentiation, 249 
relating to integration, 250, 315 ff. 
relating to sequences, 247 ff. 
relating to series, 282 
Interior Extremum Theorem, 172 
of a set, 332 
point, 332 
Intermediate Value Theorems: 
Bolzano’s, 138 
Darboux’s, 178 
Intersection of sets, 3 
Interval(s), 46 ff. 
characterization of, 47 
of convergence, 283 
length of, 46 
nested, 47 ff. 
partition of, 149, 199 
Preservation of, 139 
Inverse function, 8, 156, 169 
image, 7 
Irrational number, 25 
Iterated sums, 270 
suprema, 46 


J 


Jump, of a function, 155 


K 

K(e)-game, 58 

Kuala Lumpur, 349 
Kurzweil, Jaroslav, 289 


L 
Lagrange, J.-L., 188 
form of remainder, 190 
Least upper bound (= supremum), 37 
Lebesgue, Henri, 198, 220, 288, 362 
Dominated Convergence Theorem, 
318 


Integrability Theorem, 221, 362 
integral, 198, 289, 304 
measure, 325 
Leibniz, Gottfried, 103, 161, 198 
Alternating Series Test, 278 
Rule, 196 
Lemma, 355 
Length, of an interval, 46 
L’Hospital, G. F., 180 
Rules, 182 ff. 
Limit: 
Comparison Test, 99, 271 
of a function, 104 ff. 
inferior, 82 ff. 
infinite, 120 
one-sided, 117 
of a sequence, 56 
of a series, 94 
superior, 83 ff., 283 
Line tests, 8 
Lipschitz condition, 143 
Location of Roots Theorem, 137 
Logarithm, 257 ff. 
Logical equivalence, 349 
Lower bound, 37 
Integral, 227 
Sum, 226 


M 
M (= collection of measurable sets), 325 
M-Test, of Weierstrass, 282 
Mapping, see Function 
Mathematical Induction, 12 ff. 
Maximum, absolute, 135 
relative, 172 
Maximum-Minimum Theorem, 136, 
152, 339 
Mean Value Theorem: 

Cauchy form, 182 

for derivatives, 173 ff. 

for integrals, 215, 301 
Measurability Theorem, 322 
Measurable function, 320 
set, 325 
Measure, Lebesgue, 320 
zero, see Null set 
Meat grinder, 6 
Member of a set, 1 
Mesh (= norm) of a partition, 200 
Metric function, 342 
space, 341 ff. 
Middle, excluded, 349 
Midpoint Rule, 237 
Minimum, absolute, 135 

relative, 172 
Monotone Convergence Theorem, 71, 317 

function, 153 

sequence, 71 

Subsequence Theorem, 80 
Multiple of a sequence, 63 
Multiplication Theorem, 299 


N 
N (= collection of natural numbers), 2 

Natural numbers, 2 

Negation, 349 

Negative numbers, 26 
Neighborhood, 35, 326, 343 
Nested Intervals Property, 48, 85 
Newton, Isaac, 102, 161, 198 
Newton-Leibniz Formula, 288 
Newton’s Method, 193 ff. 
Nondifferentiable functions, 163, 
367 

Norm of a function, 244 

of a partition, 200 

Null set, 220 

Number(s): 

even, 2, 26 

irrational, 25 

natural, 2 

odd, 2, 26 

rational, 25 

real, 2, 23 ff. 


O 
Odd function, 171, 216 

number, 2, 26 
One-one function, 7 
One-sided limit, 117 
Onto, 7 
Open cover, 333 

interval, 46 

set, 327, 345 

Set Properties, 327, 345 
Order Properties of R, 26 ff. 
Ordered pair, 4 
Oscillation, 361 
P (= positive class), 26 
Partial sum, 94, 281 
summation formula, 278 
Partition, 149, 199 
δ-fine, 149, 289 
mesh of, 200 
norm of, 200 
tagged, 149, 200 
Peak, 80 
Periodic decimal, 51 
function, 148 
Piecewise linear function, 147 
Pigeonhole Principle, 357 
Point: 
boundary, 332 
cluster, 125, 329 
interior, 332 
Pointwise convergence, 241 
Polynomial: 
functions, 131 
Taylor, 189 
Positive class P, 26 
Power, of a real number, 159, 
258 
functions, 258 
series, 282 ff. 
Preservation: 
of Compactness, 339, 346 
of Intervals, 139 
Primitive of a function, 216 
Principle of Mathematical Induction, 
12 ff. 
Product: 
Cartesian, 4 
of functions, 111 
of sequences, 63 
of sets, 4 
Rule, 164 
Theorem, 222 
Proof: 
by contradiction, 356 
by contrapositive, 355 
direct, 354 
indirect, 355 
Proper subset, 2 
Properly divergent sequence, 
92 ff. 
Property, 2 
p-series, 97, 98 


Q 


Q (= collection of rational numbers), 2 


Q.E.D., 355 
Quantifiers, 352 
Quod erat demonstrandum, 355 
Quotient: 
of functions, 111 
of sequences, 63 
Rule, 164 


R 


R (= collection of real numbers), 2, 23 ff. 


Raabe’s Test, 274 
Radius of convergence, 283 
Range, of a function, 5 
Rational numbers Q, 2, 25, 51 
function, 131 
power, 159 
Ratio Test, 69, 272 
Real numbers R, 2, 23 ff. 
power of, 159, 258 
Rearrangement Theorem, 269 
Reciprocal, 24 
Reductio ad absurdum, 356 
Remainder in Taylor’s Theorem: 
integral form, 223, 299 
Lagrange form, 190 
Repeating decimals, 51 
Restriction, of a function, 10 
Riemann, Bernhard, 198, 288 
Integrability Criterion, 360 
integral, 201 ff. 
sum, 200 
Riesz-Fischer Theorem, 307 
Rolle’s Theorem, 172 
Root(s): 
existence of, 43, 157 ff. 
functions, 10, 43 
Location of, 137 
Newton’s Method, 193 ff. 
Test, 271 


S 
Schoenberg, I. J., 368 
Schwarz inequality, 225 
Second Derivative Test, 191 
Semimetric, 346 
Seminorm, 306 
Sequence(s), 55 ff. 
bounded, 63 


Cauchy, 85, 344 

constant, 55 

contractive, 88 

convergent, 56 

difference of, 63 

divergent, 56 

Fibonacci, 56, 89 

of functions, 241 ff. 

inductive, 55 

limit of, 56 

monotone, 70 ff. 

multiple of, 63 

product of, 63 

properly divergent, 92 

quotient of, 63 

recursive, 55 

shuffled, 80 

subsequence of, 78 

sum of, 63 

tail of, 59 

term of, 55 

unbounded, 63 

uniform convergence of, 243 
Series, 94 ff., 267 ff. 

absolutely convergent, 267 

alternating, 278 

alternating harmonic, 98, 267 

conditionally convergent, 267 

convergent, 94 

of functions, 281 ff. 

geometric, 95 

grouping of, 268 

harmonic, 88, 96, 267 

hypergeometric, 277 

power, 282 ff. 

p-series, 97, 98 

rearrangements of, 269 

sixless, 277 

Taylor, 285 ff. 

2-series, 97 

uniformly convergent, 281 ff. 
Set(s): 

boundary point of, 332 

bounded, 37, 347 

Cantor F, 330 

Cartesian product of, 4 

closed, 327, 345 

closure of, 333 

cluster point of, 125, 329 
compact, 334 
complement of, 3 
contains/contained in, 1 
countable, 17, 357 ff. 
denumerable, 17 
disjoint, 3 
empty, 3, 16 
equality of, 2 
finite, 16 ff., 357 ff. 
inclusion of, 1 
infimum of, 37 
infinite, 16 ff. 
interior of, 332 
interior point of, 332 
intersection of, 3 
intervals, 46 ff. 
measurable, 325 
null, 220 
open, 327, 345 
relative complement of, 3 
supremum of, 37 
symmetric difference, 11 
unbounded, 37 
uncountable, 17 
union of, 3 
void, see Empty set 
Shuffled sequence, 80 
Signum function, 109, 127 
Simpson’s Rule, 238, 365 
Sine function, 263 
Sixless series, 277 
Space-filling curve, 368 
Square root of 2: 
calculation of, 75 
existence of, 41 
irrationality of, 26 
Square root function, 10, 43 
Squaring function, 10 
Squeeze Theorem, 66, 114, 209, 
294 
Statement, 348 
Step function, 145 ff., 210 
Straddle Lemma, 171 
Strong Induction, 15 
Subcover, 333 
Subsequence, 78 
Subset, 1 
Substitution Theorems, 220, 224, 
297 
Subtraction in R, 25 
Sum: 
iterated, 270 
lower, 226 
of functions, 111 
partial, 94, 281 
Riemann, 200 
of sequences, 63 
of a series, 94 
upper, 226 
Supremum, 37 
iterated, 46 
Property, 39 
Surjection, 7 
Surjective function, 7 
Syllogism, Law of, 354 
Symmetric difference, 11 


T 
Tagged partition, 149, 200 
Tail, of a sequence, 59 
Tautology, 349 
Taylor, Brook, 188 
polynomial, 189 
series, 285 ff. 
Taylor’s Theorem, 189, 223, 299 
Terminating decimal, 51 
Test: 
first derivative, 175 
for absolute convergence, 270 ff. 


for convergence of series, 96 ff., 257 ff. 


nth derivative, 191 

nth Term, 96 
Thomae’s function, 128, 206 
Translate, 207 
Trapezoidal Rule, 235 ff., 364 
Triangle Inequality, 32 
Trichotomy Property, 26 
Trigonometric functions, 260 ff. 


U 
Ultimately, 59 
Uncountable, 17 
Uncountability of R, 49, 52 
Uniform continuity, 142 ff., 152 
Uniform convergence: 

of a sequence, 246 ff., 316 

of a series, 281 ff. 
Uniform differentiability, 180 
Uniform norm, 244 
Union of sets, 3 
Uniqueness Theorem: 

for finite sets, 16, 357 

for integrals, 201, 290 

for power series, 285 
Universal quantifier ∀, 352 
Upper bound, 37 

integral, 227 

sum, 226 


V 

Value, of a function, 5 

van der Waerden, B. L., 367 
Vertical Line Test, 5 

Void set, see Empty set 


W 
Well-ordering Property of N, 12 
Weierstrass, Karl, 102, 124, 163 
Approximation Theorem, 148 
M-Test, 282, 367 
nondifferentiable function, 163, 
367 


Z 

Z (= collection of integers), 2 
Zero element, 24 

Zero measure, see Null set 


